Abstract

This report presents the results of a study of software metrics applied to
various  classes  of  COBOL  programs.  Particular  attention  is  given  to  the 
software  science  metrics  of  Halstead,  which  were  applied  to  hundreds  of  COBOL 
programs  written  by  students  at  Ohio  State  University,  as  well  as  several 
production  COBOL  programs.  The  results  include  support  of  the  inclusion  of 
Data  Division  in  the  software  science  counting  strategy,  nonsupport  for  the 
use  of  the  software  science  language  level  metric,  and  the  identification  of  a 
weakness  in  the  ability  of  the  Halstead  E  measure  to  capture  integration 
effort. 


Several  proposed  complexity  metrics  were  compared  for  their  ability  to 
predict  actual  development  effort,  with  none  of  the  metrics  studied  behaving 
in  an  impressive  manner.  Some  approaches  for  refining  existing  complexity 
metrics  to  overcome  their  apparent  weaknesses  are  suggested. 
1. Introduction


Software  Metrics,  as  a  branch  of  Software  Engineering,  plays  an  important 
role  in  the  analysis  and  evaluation  of  software.  Many  complexity  metrics  for 
computer  programs  have  been  developed.  There  are  two  categories  into  which 
many  of  the  more  popular  metrics  can  be  divided.  The  first  category  may  be 
termed lexical metrics, which are based on the counts of various lexical
tokens in the system. This category includes Halstead's software science
metrics [3] and McCabe's cyclomatic complexity metric [5]. The second
category of metrics deals with the system connectivity by observing the flow
of control or information among the system components. Recent work of Henry
and Kafura [4] using information flow, and the chunk model complexity measure
by Davis [1], fall into this category.

This  report  presents  the  results  of  a  study  of  the  software  metrics  with 
special  emphasis  on  Halstead's  software  science  metrics.  The  area  of  software 
science  has  been  explicitly  studied  by  many  independent  research  groups. 
Since  many  of  the  experimental  results  reported  by  Halstead  and  others  have 
been  very  encouraging,  these  metrics  have  received  considerable  attention  from 
the  computer  science  community.  Most  of  the  work  in  applying  metrics  of 
computer  software  using  the  methodology  of  software  science  has  concentrated 
on relatively few programming languages such as Fortran and PL/I. COBOL has
received relatively little research attention, with two notable exceptions.
Zweben and Fung [11] reported the results of a preliminary study of COBOL
programs which were counted manually. The work of Zweben and Fung [11]
initiated the writing of a software science analyzer [2] for further study of
software science metrics in a COBOL environment. The use of this analyzer
helps to collect a large amount of data on COBOL programs. The analyzer
provides a mechanical way of counting the tokens (operators and operands) of a
COBOL program, and hence can produce all of the software science statistics.
The software metrics research group at Purdue University has done perhaps the
most comprehensive study of COBOL programs using software science [6]. Some
aspects of our study, particularly those described in the next chapter, are
similar to theirs, though different programs, programmers and analyses were
involved. This report will present the results of the analysis of a very
large number of COBOL programs collected from various sources.

This report is divided into five main chapters. The next chapter deals
with the verification of the software science metrics using a large number of
COBOL programs. The chapter has been divided into four sections. A brief
review of software science [3] is presented in the first section. The second
and  third  sections  are  concerned  with  the  results  of  the  analysis  of  the  COBOL 
programs  collected  from  two  different  sources,  namely  undergraduate  students' 
programs  at  The  Ohio  State  University  (OSU)  and  COBOL  programs  written  by 
production  programmers  at  the  University  Systems  Computer  Center  of  OSU.  The 
last  section  shows  the  result  of  the  comparison  of  the  software  science 
statistics  for  some  COBOL  programs,  which  were  run  through  two  different  COBOL 
analyzers  —  one  developed  at  OSU  and  the  other  produced  by  the  software 
science  research  group  at  Purdue  University.  The  third  chapter  has  been 
included  to  show  the  relationships  among  four  different  software  metrics, 
namely  Halstead's  Effort  metric,  McCabe's  cyclomatic  complexity  metric, 
Kafura's  information  flow  complexity  metric  and  Davis'  chunk  model  complexity 
metric.  The  primary  motivation  of  this  chapter  is  to  study  the  concept  of 
effort  and  its  relation  to  development  time  of  software.  The  four  different 
complexity  metrics  were  evaluated  for  a  small  set  of  COBOL  programs  in  order 
to  assess  the  relative  complexities  of  these  programs.  The  summary  of  the 
results of comparing these four metrics is shown at the end of this chapter.
The weaknesses of these metrics motivated consideration of a new model for
software effort. This is outlined in Chapter 4. Finally, the closing chapter
suggests additional work to be done on this topic.

Two appendices have been included for completeness of the report. Appendix
A describes the counting strategy used to calculate the information flow and
chunk model complexities for COBOL programs, together with the explicit
calculations required for the results of Chapter 3. Appendix B gives the
major steps of the calculations carried out to obtain the results shown in
Chapter 4.


2.  Verification  of  the  Software  Science  Metrics  in  COBOL  Environment 


This  chapter  deals  with  the  verification  of  the  software  science  metrics 
using  a  large  number  of  COBOL  programs  collected  from  various  sources. 

2.1  Review  of  the  Software  Science  Metrics 

In  software  science,  a  computer  program  is  considered  to  be  a  string  of 
tokens  which  are  divided  into  "Operators"  and  "Operands".  Generally,  any 
symbol  or  keyword  group  in  a  program  that  specifies  an  algorithm  action  of  the 
computer  is  considered  an  operator,  and  any  symbol  used  to  represent  data  is 
considered  an  operand.  All  software  science  measures  are  functions  of  the 
counts  of  the  operators  and  the  operands. 
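
The operator/operand split can be illustrated with a toy count. The sketch below is not the counting strategy of the analyzer [2]; the operator set `OPERATORS` and the function name `count_tokens` are assumptions made only for illustration, and every token outside that set is simply treated as an operand.

```python
from collections import Counter

# Hypothetical operator set for a tiny COBOL-like fragment (an assumption,
# not the counting strategy used by the analyzer described in this report).
OPERATORS = {"MOVE", "TO", "ADD", "IF", "PERFORM", "."}


def count_tokens(tokens):
    """Return (n1, n2, N1, N2) for a list of already-split tokens."""
    operators = Counter(t for t in tokens if t in OPERATORS)
    operands = Counter(t for t in tokens if t not in OPERATORS)
    n1 = len(operators)            # unique operators
    n2 = len(operands)             # unique operands
    N1 = sum(operators.values())   # total operator occurrences
    N2 = sum(operands.values())    # total operand occurrences
    return n1, n2, N1, N2


tokens = "MOVE A TO B . ADD A TO C .".split()
counts = count_tokens(tokens)
```

Every software science measure in the rest of this section is a function of the four counts returned here.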

The basic metrics in software science are defined as:

$n_1$ = number of unique operators
$n_2$ = number of unique operands
$N_1$ = total occurrences of operators
$N_2$ = total occurrences of operands.

The length of the program is defined as

$N = N_1 + N_2$    (2.1)

and the vocabulary of a program is defined as

$n = n_1 + n_2$.    (2.2)

All other metrics are defined in terms of these basic terms and are shown
below.

The estimated length is defined by the length equation:

$\hat{N} = n_1 \log_2 n_1 + n_2 \log_2 n_2$.    (2.3)

A suitable metric for measuring the size of the program, called volume, is
given by

$V = N \log_2 n$ bits.    (2.4)


Intuitively, the volume is the minimum number of bits necessary to
represent a complete program. The minimum possible volume that an algorithm
can take is known as its potential volume, denoted by $V^*$. By definition,

$V^* = (2 + n_2^*) \log_2 (2 + n_2^*)$,    (2.5)

where $n_2^*$ = number of I/O parameters.

In terms of $V$ and $V^*$, a metric called program level $L$ of implementation of
an algorithm can be defined as

$L = V^* / V$,  $0 < L \le 1$.    (2.6)

An approximation to this definition of $L$, expressed in terms of the number
of operators and operands used in the program, is denoted by

$\hat{L} = \frac{2}{n_1} \cdot \frac{n_2}{N_2}$.    (2.7)

The inverse of the program level is termed the difficulty, $D$. That is,

$D = 1 / L$.    (2.8)

Therefore, as the volume of an implementation of a program increases, the
program level decreases and the difficulty increases.

A metric, suggested by Halstead to characterize a programming language, is
called the language level $\lambda$ and defined as

$\lambda = L \times V^* = L^2 V$.    (2.9)

Finally, a metric referred to as Effort is defined by the ratio

$E = V / L = V^2 / V^*$.    (2.10)

From the definition, it is clear that the effort required to implement a
computer program increases as the size of the program increases. Therefore,
since $V^*$ is fixed for a given algorithm, software science predicts that higher
level languages reduce the effort of programming.

An estimate of $E$ can be obtained using the estimate of program level. That
is,

$\hat{E} = V / \hat{L}$.    (2.11)

The  effort  statistic  has  been  interpreted  as  a  measure  of  the  mental  effort 
required  to  create  a  program.  In  other  words,  E  represents  the  number  of 
mental  discriminations  or  decisions  that  a  single,  fluent,  concentrating 
programmer  should  make  in  implementing  the  algorithm. 

According to Halstead [3], the programming time should be directly
proportional to the effort in a program. That is,

$T = E / S$ sec,    (2.12)

where  S  denotes  the  rate  of  mental  activity  of  the  programmer;  i.e.,  S  refers 
to  the  number  of  mental  discriminations  per  second  of  which  the  programmer  is 
capable.  A  value  of  S=18  has  been  used  in  previous  research  in  software 
science. 
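
Equations (2.1) to (2.12) can be collected into a short calculation. The following is a minimal sketch, assuming the four basic counts are already available; the function name `halstead_metrics` and the dictionary keys are illustrative choices, not part of any analyzer mentioned in this report. It uses the estimated level of equation (2.7) throughout, since the potential volume is generally not known.

```python
import math


def halstead_metrics(n1, n2, N1, N2, S=18.0):
    """Software science statistics from the four basic counts.

    n1, n2: unique operators / operands; N1, N2: total occurrences.
    S: mental discriminations per second (18 is the value quoted above).
    """
    N = N1 + N2                                        # length (2.1)
    n = n1 + n2                                        # vocabulary (2.2)
    N_hat = n1 * math.log2(n1) + n2 * math.log2(n2)    # estimated length (2.3)
    V = N * math.log2(n)                               # volume in bits (2.4)
    L_hat = (2.0 / n1) * (n2 / N2)                     # estimated level (2.7)
    D = 1.0 / L_hat                                    # difficulty (2.8)
    lam = L_hat ** 2 * V                               # language level (2.9)
    E_hat = V / L_hat                                  # estimated effort (2.11)
    T = E_hat / S                                      # time in seconds (2.12)
    return {"N": N, "n": n, "N_hat": N_hat, "V": V, "L_hat": L_hat,
            "D": D, "lambda": lam, "E_hat": E_hat, "T": T}


# Made-up counts chosen so that the arithmetic is easy to follow by hand.
m = halstead_metrics(n1=16, n2=16, N1=64, N2=64)
```

With these made-up counts the estimated length equals the observed length (128), the volume is 640 bits, the estimated level is 0.03125 and the estimated effort is 20480 discriminations.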


2.2  Analysis  of  Students'  Programs 

This section shows the results of the analysis of a large number of COBOL
programs written by students at Ohio State University.

The data were collected by the use of a software science COBOL analyzer [2]
developed by the software metrics research group at Ohio State. The analyzer,
written in PL/I, counts operators and operands in the Data and Procedure
divisions, and computes all software science statistics for the entire
program.


The  existence  of  this  analyzer  facilitates  the  collection  of  a  substantial 
amount  of  data  from  the  students  of  various  undergraduate  courses.  Each 
student  uses  a  simple  command,  called  ANALYZE,  to  run  his/her  COBOL  program 
through  the  analyzer.  The  outputs  (software  science  statistics)  of  the 
analyzer  are  stored  on  disk.  At  the  end  of  each  quarter,  a  final  report 
containing  the  software  science  statistics  of  all  the  students'  programs  is 
generated  for  analysis.  The  results  of  the  analysis  of  all  the  data  collected 
during  six  consecutive  quarters  are  presented  in  this  section.  Various  kinds 
of  analysis  were  performed  as  described  below. 

For each particular program the following software science metrics were
evaluated:

N, $\hat{N}$, $\lambda$, $\hat{L}$, D and $\hat{E}$.

Each  of  the  metric  values  shown  in  the  table  represents  the  mean  of  all 
such  values  for  a  given  program  during  a  given  quarter,  obtained  from  the  set 
of  programs  written  by  different  subjects. 

Programs  written  by  the  students  of  two  different  undergraduate  courses  are 
considered  in  the  present  analysis.  One  course  is  an  introduction  to  Data 
Processing  (CIS  212),  and  the  other  course  deals  with  the  introduction  to  File 
Processing  (CIS  313).  CIS  212  is  the  introductory  COBOL  course,  and  CIS  313 
is the next course in sequence (also using COBOL). In the first course, we
examined six assignments (Labs 2, 3, 4A, 4B, 5 and 6). Labs 2, 3 and 4A are of
increasing complexity (in terms of size and problem concepts), and in fact,
each lab is an extension of the previous lab. In other words, knowledge and
understanding of Lab 2 is helpful for writing Lab 3, and knowledge of Lab 3 is
somewhat directly useful to completing Lab 4A since both involve the use of a
matching algorithm. Lab 4B deals with manipulating one- and two-dimensional
arrays, and is conceptually different from the previous three labs. Labs 3
and 6 are of almost the same complexity, although they solve two independent
problems. Lab 5 involves sorting and Lab 6 is very similar to a report
generator program. In the Fall of 1982, the curriculum was modified so that
there were only five assignments instead of six. In particular, Labs 3 and 4A
of the previous quarters were merged into a single assignment called Lab 3.
This new Lab 3 has the identical function as old Lab 4A. All other
assignments were kept unchanged.


The  second  course  contains  three  assignments  involving  COBOL.  The  first 
program  is  a  simple  file  listing  program.  The  second  assignment  deals  with 
input  data  validation,  and  the  third  program  updates  a  product  master  file  by 
making  changes  to  several  fields  (e.g.,  product  descriptions  and  prices, 
adding  new  products  and  deleting  old  products,  etc.).  These  three  programs 
are  also  of  increasing  complexity,  although  each  assignment  is  a  direct 
extension  of  the  previous  one. 


The first set of analyses will involve the Halstead length equation and
language level metrics, so that we will be interested in N, $\hat{N}$ and $\lambda$. We will
compute the mean error and mean absolute relative error in the length equation
for each assignment. The mean error is defined by $(\bar{N} - \bar{\hat{N}})/\bar{N}$, where $\bar{N}$ and $\bar{\hat{N}}$
represent the mean values of the $N_i$'s and $\hat{N}_i$'s, respectively, for all the
subjects performing the same assignment. $N_i$ corresponds to the value of N for
the ith subject. The mean absolute relative error is defined as

$\frac{1}{n} \sum_{i=1}^{n} \frac{|N_i - \hat{N}_i|}{N_i}$,

where n is the number of subjects doing one particular assignment. The
results of these analyses as obtained in six different quarters for the course
CIS 212 are shown in the following tables.
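
The two error measures differ in where the averaging happens, which a small calculation makes concrete. This is a sketch under stated assumptions: the function names are hypothetical, the per-subject values are made up, and the relative error is taken with respect to the observed length of each subject's program.

```python
def mean_error(lengths, estimates):
    """(mean N - mean N_hat) / mean N: a ratio of per-assignment means."""
    mean_n = sum(lengths) / len(lengths)
    mean_n_hat = sum(estimates) / len(estimates)
    return (mean_n - mean_n_hat) / mean_n


def mean_abs_relative_error(lengths, estimates):
    """Average over subjects of |N_i - N_hat_i| / N_i."""
    pairs = list(zip(lengths, estimates))
    return sum(abs(n - n_hat) / n for n, n_hat in pairs) / len(pairs)


# Made-up lengths for two subjects: one overestimate and one underestimate.
N = [100, 200]
N_hat = [110, 180]
signed = mean_error(N, N_hat)                  # the errors partly cancel
absolute = mean_abs_relative_error(N, N_hat)   # the errors do not cancel

# The mean error can also be recomputed from a pair of tabulated means,
# e.g. N = 993 and N_hat = 1351 for a whole program.
table_check = round((993 - 1351) / 993, 2)
```

A negative mean error means that the length equation overestimates the observed length on average, which is the convention used in the tables.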


LAB 2 (#subjects = 19)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    643    1129         -0.75              0.67         31.3
Proc. Div.   350    519          -0.48              0.50         2.88
Program      993    1351         -0.36              0.36         2.19

LAB 3 (#subjects = 12)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1117   1739         -0.56              0.59         42.4
Proc. Div.   719    902          -0.25              0.28         2.34
Program      1836   2101         -0.14              0.17         1.36

LAB 4A (#subjects = 19)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1408   2121         --                 0.51         44.97
Proc. Div.   --     1251         --                 0.18         1.78
Program      2477   2559         -0.03              0.07         1.30

Table 1: Spring [1981], CIS 212


             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    833    1317         -0.58              0.60         35.50
Proc. Div.   389    578          --                 0.51         2.69
Program      1222   1573         -0.29              0.30         2.06

LAB 3 (#subjects = 22)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1032   1583         -0.53              0.55         37.00
Proc. Div.   640    849          -0.33              0.35         2.14
Program      1659   1936         -0.17              0.18         1.54

LAB 4A (#subjects = 9)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1474   2310         -0.57              0.58         51.24
Proc. Div.   1050   1337         -0.27              0.28         2.04
Program      2525   2744         -0.09              0.10         1.43

Table 3: Summer [1981], CIS 212


LAB 02 (#subjects = 65)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    695    1135         --                 --           29.8
Proc. Div.   336    523          --                 --           2.7
Program      1031   1412         -0.36              0.35         2.0

LAB 03 (#subjects = 76)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1082   1652         -0.52              0.54         37.7
Proc. Div.   592    837          -0.41              0.43         2.3
Program      1674   1993         -0.19              0.20         1.7

LAB 4A (#subjects = 70)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1429   2229         -0.55              0.57         46.1
Proc. Div.   1019   1240         -0.22              0.24         1.6
Program      2448   2674         -0.09              0.14         1.3

Table 5: Fall [1981], CIS 212

LAB 4B (#subjects = 69)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    714    1289         -0.80              0.84         26.4
Proc. Div.   1139   837          +0.26              0.24         0.45
Program      1853   1725         +0.07              0.13         0.63

LAB 05 (#subjects = 66)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1061   1721         -0.62              0.63         40.6
Proc. Div.   720    990          -0.37              0.38         1.42
Program      1731   2150         -0.21              0.21         1.17

LAB 06 (#subjects = 53)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1146   1607         -0.40              0.46         36.2
Proc. Div.   653    918          -0.41              0.42         2.2
Program      1799   1953         -0.08              0.12         1.4

Table 6: Fall [1981], CIS 212


             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    730    1155         -0.58              0.60         28.36
Proc. Div.   353    538          -0.52              0.51         2.73
Program      1083   1398         -0.29              0.31         1.94

LAB 3 (#subjects = 110)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1120   1638         -0.46              0.50         35.88
Proc. Div.   616    850          -0.38              0.40         2.39
Program      1736   1968         -0.13              0.17         1.62

LAB 4A (#subjects = 105)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1431   2173         -0.52              0.54         48.37
Proc. Div.   1047   1226         -0.17              0.20         1.78
Program      2478   2595         -0.05              0.11         1.37

LAB 4B (#subjects = 105)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    735    1272         -0.73              0.79         25.7
Proc. Div.   1151   889          +0.23              0.25         0.45
Program      1886   1712         +0.09              0.15         0.62

LAB 05 (#subjects = 96)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1118   1752         -0.57              0.57         42.7
Proc. Div.   742    1005         -0.35              0.37         1.7
Program      1860   2154         -0.16              0.17         1.28

LAB 06 (#subjects = 81)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1348   1699         -0.26              0.29         32.7
Proc. Div.   632    872          -0.38              0.43         2.29
Program      --     2018         -0.02              0.08         1.40

Table 8: Winter [1982], CIS 212
LAB 2 (#subjects = 154)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    720    1158         -0.61              0.60         29.14
Proc. Div.   356    552          -0.55              0.54         2.63
Program      1076   1422         -0.32              0.32         1.89

LAB 3 (#subjects = 128)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1111   1649         -0.43              0.49         36.78
Proc. Div.   602    850          -0.41              0.43         2.38
Program      1713   1988         -0.16              0.16         1.63

LAB 4A (#subjects = 114)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1470   2217         -0.51              0.51         48.08
Proc. Div.   1049   1245         -0.18              0.22         1.83
Program      2519   2633         -0.05              0.08         1.38

Table 9: Spring [1982], CIS 212

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    --     1745         -0.56              0.55         --
Proc. Div.   --     1014         -0.56              0.37         --
Program      --     2145         -0.15              0.16         --

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    725    1140         -0.57              0.58         29.0
Proc. Div.   329    496          -0.50              0.51         2.02
Program      1054   1390         -0.32              0.32         1.73

LAB 3 (#subjects = 67)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    --     --           --                 --           --
Proc. Div.   --     --           --                 --           --
Program      --     --           --                 --           --


LAB 4B (#subjects = 109)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    --     --           --                 --           23.36
Proc. Div.   --     --           --                 --           0.37
Program      --     --           --                 --           0.50

LAB 05

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1058   1653         -0.56              0.57         37.8
Proc. Div.   721    966          -0.33              0.34         1.23
Program      1779   2075         -0.16              0.17         1.02

LAB 06 (#subjects = 28)

             N      $\hat{N}$    $(N-\hat{N})/N$    Mean Abs.    $\lambda$
                                 (Mean Err.)        Rel. Err.
Data Div.    1384   1744         -0.26              0.26         32.6
Proc. Div.   615    928          -0.50              0.53         2.17
Program      1999   2125         -0.06              0.07         1.26

Table 12: Fall [1982], CIS 212


The sign of the error indicates whether $\hat{N}$ is an overestimate or
underestimate. Note that $\hat{N}$ is consistently an overestimate of the actual
program length in all the assignments except Lab 4B. Also, neither the data
nor the procedure division alone yields a very acceptable length estimate. So,
comparison of the results among all six quarters indicates that the length
equation works well only when the data division is combined with the procedure
division. In other words, the best estimate of the program length is attained
when the entire program is taken into consideration. The same conclusion was
drawn by Shen and Dunsmore in their software science analysis of COBOL
programs [6] and was suggested by Zweben and Fung [11].


Software science postulates that the language level ($\lambda$) may be used to
compare various programming languages. If $\lambda$ is indeed a property of the
programming language, we might expect that it is approximately a constant for
all programs written in a given language. However, the present analysis
indicates that the language level is not constant. This finding agrees with
that of Shen and Dunsmore [6]. It is also noticed that the $\lambda$ for the data
division is always very high compared to the $\lambda$ of the procedure division and
that of the program. This extremely large $\lambda$ for the data division reflects
the fact that COBOL provides for a compact representation of a good deal of
information about type, size, structure and initial values of individual and
group data items.

The other software science metrics that were evaluated for each program are
NOS (number of statements), $\hat{L}$, D and $\hat{E}$. Each of these metrics was calculated
separately for the data division, procedure division and for the entire
program. For the data division, NOS denotes the number of periods, but for
the procedure division NOS refers to the total number of COBOL verbs used.
The primary reasons for calculating these metrics for each program are to
observe how these metrics differ from one quarter to the next, and to
determine whether the change of D or $\hat{E}$ from one assignment to the next really
reflects the intuitive relative complexity between them.

The  mean  values  of  these  metrics  for  each  program  of  CIS  212  in  six 
different  quarters  are  presented  in  the  following  tables.  The  number  of 
programs  (subjects)  analyzed  for  each  lab  in  a  given  quarter  is  the  same  as 
that  indicated  in  the  previous  set  of  tables. 

It should be noted that the values of the software science metrics obtained
in six different quarters appear to be very consistent. For example, N, $\hat{N}$,
Error, $\lambda$, NOS, $\hat{L}$, D and $\hat{E}$ corresponding to each assignment are observed to be
compatible from one quarter to the next. Note that there is a significant


LAB 2

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    114    0.0821       12.17              56667
Proc. Div.   71     0.0386       25.93              73473
Program      185    0.0175       57.29              484810

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    --     0.0628       15.92              185558
Proc. Div.   --     0.0149       66.85              561890
Program      --     0.0079       125.91             2699046

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    143    0.0589       16.96              109025
Proc. Div.   190    0.0066       152.35             1412113
Program      333    0.0053       171.88             2902761

LAB 5

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    179    0.0683       14.65              120101
Proc. Div.   140    0.0181       55.22              300835
Program      319    0.0096       103.28             1575389

LAB 6

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    178    0.0689       14.50              115863
Proc. Div.   118    0.0227       44.05              209213
Program      296    0.0109       91.74              1297416

Table 14: Spring [1981], CIS 212
LAB 2

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    141    0.0764       13.03              85831
Proc. Div.   77     0.0553       18.08              35693
Program      218    0.0150       66.50              667409

LAB 3

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    173    0.0639       14.31              120941
Proc. Div.   127    0.0217       46.06              233156
Program      300    0.0109       92.12              1286308

LAB 4A

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    254    0.0659       15.17              187823
Proc. Div.   213    0.0160       62.33              523401
Program      467    0.0082       121.30             2655400

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    128    0.0714       14.0               79007
Proc. Div.   182    0.0076       130.7              1243456
Program      310    0.0067       148.6              2356694

LAB 5

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    188    0.0701       14.2               122194
Proc. Div.   145    0.0165       60.4               327872
Program      333    0.0089       111.3              1660602

LAB 6

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    200    0.0649       15.4               147296
Proc. Div.   126    0.0219       45.5               230030
Program      326    0.0099       100.5              1590215

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    130    0.0742       13.5               77459
Proc. Div.   71     0.0363       27.5               68916
Program      201    0.0154       64.8               559580

LAB 3

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    192    0.0648       15.4               142029
Proc. Div.   126    0.0236       42.3               196147
Program      318    0.0108       92.6               1368461

LAB 4A

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    248    0.0642       15.5               186835
Proc. Div.   215    0.0151       66.3               548483
Program      463    0.0081       123.4              2677119

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    131    0.0703       14.22              86247
Proc. Div.   186    0.0075       132.29             1251474
Program      317    0.0065       152.28             2462322

LAB 5

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    196    0.0701       14.25              129798
Proc. Div.   148    0.0177       56.32              317638
Program      344    0.0092       108.9              1687651

LAB 6

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    228    0.0577       17.32              196688
Proc. Div.   123    0.0239       41.69              220578
Program      351    0.0098       101.8              1802300

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    126    0.0745       13.40              73208
Proc. Div.   71     0.0329       29.45              72472
Program      197    0.0152       65.84              561437

LAB 3

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    191    0.0655       15.26              135565
Proc. Div.   123    0.0236       42.30              189777
Program      314    0.0108       92.57              1314060

LAB 4A

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    255    0.0637       15.70              192030
Proc. Div.   215    0.0152       65.78              543205
Program      470    0.0080       124.51             2702553

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    126    0.0708       14.12              80078
Proc. Div.   189    0.0071       139.89             1386760
Program      315    0.0064       156.95             2547522

LAB 5

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    195    0.0691       14.47              129545
Proc. Div.   150    0.0170       58.84              340214
Program      345    0.0087       113.95             1778396

LAB 6

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    224    0.0553       18.08              192614
Proc. Div.   118    0.0236       42.35              204212
Program      342    0.0092       108.23             1768135

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    --     0.0659       15.16              92920
Proc. Div.   --     0.0069       144.0              1476480
Program      --     0.0059       169.2              2900182

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    --     0.0679       14.72              126003
Proc. Div.   --     0.0153       65.12              353810
Program      --     0.0084       118.5              1784601

             NOS    $\hat{L}$    $D = 1/\hat{L}$    $\hat{E}$
Data Div.    --     0.0549       13.2               200314
Proc. Div.   --     0.0225       44.36              219478
Program      --     0.0083       113.26             1921595

Table 24: Fall [1982], CIS 212

difference in the values of the software science metrics for Lab 3 in the Fall
of 1982 as compared to Lab 3 in the previous quarters. In particular, the new
Lab 3 has higher N, $\hat{N}$, NOS, D and $\hat{E}$ values than those for the old Lab 3, but
comparable to those of Lab 4A, as expected due to the change made in this
particular assignment in the Fall of 1982.


The final set of analyses performed on the programs of CIS 212 is the
calculation of the relative values of NOS, D and $\hat{E}$ for all the assignments to
those of Lab 2. Since this is the first introductory programming course in
COBOL, the purpose of these analyses is to see the change in the values of
these metrics with respect to the complexity of the assignments relative to
Lab 2. The explicit values of these analyses are shown separately for each
program in the following tables.
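
The relative values are plain ratios against the Lab 2 means. The sketch below reproduces one row under stated assumptions: the helper name `relative_to_lab2` is hypothetical, and the inputs are taken to be the Spring 1981 Data Division means for Lab 2 and Lab 6 as they appear in the preceding tables.

```python
def relative_to_lab2(lab, lab2):
    """Divide each mean metric of a lab by the corresponding Lab 2 mean."""
    return {name: lab[name] / lab2[name] for name in lab}


# Data Division means, assumed to be the Spring 1981 values for Lab 2 and Lab 6.
lab2_data = {"NOS": 114, "D": 12.17, "E": 56667}
lab6_data = {"NOS": 178, "D": 14.50, "E": 115863}

relative = relative_to_lab2(lab6_data, lab2_data)
rounded = {name: round(value, 2) for name, value in relative.items()}
```

Rounded to two places, these ratios agree with the Data row that Table 26 reports for Lab 6.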
LAB 6

         NOS    Rel. NOS    D        Rel. D      $\hat{E}$    Rel. E
                to Lab 2             to Lab 2                 to Lab 2
Data     178    1.56        14.50    1.19        115863       2.04
Proc.    118    1.66        44.05    1.69        209213       2.85
Prog.    296    1.6         91.74    1.60        1297416      2.68

Table 26: Spring [1981], CIS 212

LAB 2

         NOS    Rel. NOS    D        Rel. D      $\hat{E}$    Rel. E
                to Lab 2             to Lab 2                 to Lab 2
Data     141    1           13.08    1           55831        1
Proc.    77     1           18.08    1           85693        1
Prog.    218    1           66.50    1           667409       1

LAB 3

         NOS    Rel. NOS    D
                to Lab 2
In order to test the validity of the software science metrics, further
analysis was performed on the relative difficulty of these assignments. For
such analysis, the coordinator (supervisor) of this course was asked to give
the approximate relative difficulties of each assignment with respect to the
others. It should be mentioned that the coordinator was not aware of the
analyses and research that were being performed on these assignments. He was
asked to report the relative ratings of the assignments to help assess aspects
of the course curriculum. The relative difficulties of the assignments
reported by the coordinator can therefore be treated as an independent set of
data.

In addition to the relative difficulties of the assignments, the
coordinator also provided the approximate amount of time (based on his
experience with the course and interaction with students who had taken the
course) that a student spent completing Lab 4A. Based on this time (in
hours) for Lab 4A and the relative ratings of all the assignments (assuming the
difficulty of Lab 2 to be one unit), the times needed to complete the rest of
the assignments were calculated. The results, on the basis of the
coordinator's report, are shown below.


Assignment #                    2     3     4A    4B    5     6

Relative Difficulty             1     2     5     6.4   3.6   3
(coordinator)

Approximate Effort              7     14    35    45    25    21
- Hours (coordinator)
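
The hours row can be reproduced from the relative difficulties. The following is a minimal Python sketch, assuming that the coordinator's time for Lab 4A was the 35 hours shown above, so that one unit of difficulty corresponds to 35 / 5 = 7 hours. The variable names are illustrative only and do not come from the report.

```python
# Relative difficulty of each assignment as reported by the coordinator
# (Lab 2 is the unit of difficulty).
relative_difficulty = {"2": 1, "3": 2, "4A": 5, "4B": 6.4, "5": 3.6, "6": 3}

# Assumption: the coordinator's time for Lab 4A is the 35 hours in the table.
LAB_4A_HOURS = 35
hours_per_unit = LAB_4A_HOURS / relative_difficulty["4A"]

# Scale every relative rating by the hours per unit and round to whole hours.
coordinator_hours = {
    lab: round(difficulty * hours_per_unit)
    for lab, difficulty in relative_difficulty.items()
}

for lab, hours in coordinator_hours.items():
    print(f"Lab {lab}: about {hours} hours")
```

Rounding to whole hours is a choice made here to match the precision of the table.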

The  timing  information  thus  obtained  for  each  assignment  was  compared  with 
the  estimated  time  calculated  from  the  software  science  effort  metric.  The 
range  of  the  relative  effort  (assuming  the  effort  for  Lab  2  to  be  unity)  for 
each  program,  based  on  the  values  found  in  six  different  quarters,  and  the 
corresponding  range  of  the  estimated  times  are  shown. 
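
As a rough illustration of how an effort value is turned into a time estimate, the sketch below divides E by a Stroud number of 18 elementary discriminations per second. The value 18 is an assumption here: it is the conventional software science choice and it is consistent with the E and hours columns reported for the production programs in Section 2.4, but this section does not state it. The function name is hypothetical.

```python
# Assumption: Stroud number S = 18 elementary mental discriminations per second.
STROUD_NUMBER = 18
SECONDS_PER_HOUR = 3600


def effort_to_hours(effort: float, stroud: float = STROUD_NUMBER) -> float:
    """Convert a Halstead effort value E into an estimated time in hours."""
    return effort / stroud / SECONDS_PER_HOUR


# E values taken from the analyzer comparison in Section 2.4:
# File 15 (Purdue counts) and File 16 (OSU counts).
print(round(effort_to_hours(5201458)))
print(round(effort_to_hours(10220465)))
```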
effort metric. The Halstead difficulty metric also does not give the same
relative complexities as does the effort metric. It is interesting to observe
that the relative time allotted to these labs is inadequate no matter which
relative measure is used. Traditionally, students taking this course seem to
face a great deal of difficulty in completing their assignments due to lack
of time. Therefore, these kinds of data analyses appear to be potentially
helpful for curriculum improvements.

The  analyses  performed  on  the  programs  of  CIS  212  were  also  done  for  the 
three  programs  of  CIS  313.  The  results  of  these  analyses  obtained  in  three 
different  quarters  are  shown  in  the  following  tables. 

The  software  science  metrics  found  for  Lab  1  and  Lab  2  in  three  different 
quarters  appear  to  be  quite  consistent.  However,  in  the  case  of  Lab  3,  the 
values  of  the  metrics  vary  slightly  from  one  quarter  to  the  next. 

It is observed that for these three programs, the length equation works
equally well for the entire program and for the procedure division
alone. Also, note that for Lab 2 and Lab 3 (the larger labs), when the entire
program is considered, N̂ is an underestimate of the actual length. But
for Lab 1, N̂ is consistently an overestimate of the actual program length. In
addition, the data division analysis for each program shows a very large
λ value, as was observed in connection with the programs of CIS 212. All
other metrics, e.g., NOS, D and E, for each assignment are reasonably
consistent in every quarter.

2.3  Analysis  of  University  System  Computer  Center  Programs 

The  analyses  of  the  students'  programs  collected  from  the  introductory 
COBOL  courses  at  the  Ohio  State  University  were  shown  in  the  previous  section. 
The  purpose  of  this  section  is  to  study  the  behavior  of  the  software  science 
metrics  for  programs  written  in  an  environment  which  is  different  from  the 
Table 45: CIS 313, Fall 1982

students'  environment.  In  order  to  observe  such  behavior,  ten  COBOL  programs 
of  various  sizes  were  obtained  from  the  University  Systems  Computer  Center  at 
the Ohio State University. These ten production programs, written by
professional programmers, are much larger and perform different kinds
of functions than the students' programs considered earlier.

Each  of  the  ten  programs  was  run  through  the  software  science  analyzer  at 
the  Ohio  State  University  to  calculate  the  software  science  statistics.  The 
results  of  the  analyses  of  these  programs  follow. 

The analyses of the University Systems programs show that for some programs

Table 65: Program-ID: KEW-VX-TASK 75 [File 24]


the  length  equation  works  better  for  the  procedure  division  alone,  and  for 
others  the  length  equation  gives  a  better  estimate  when  the  entire  program  is 
considered  (i.e.,  when  the  data  division  is  combined  with  the  procedure 
division).  It  should  be  noted,  however,  that  for  almost  all  of  these  programs 
both  the  procedure  division  and  the  entire  program  give  reasonable  values  of 
the error. A similar conclusion was drawn by Shen and Dunsmore from the
analysis of their COBOL analyzer program itself [6]. Since the data division
is  a  significant  part  of  any  COBOL  program  and  may  require  a  considerable 
amount  of  programming  effort,  it  still  seems  reasonable  to  include  it  in 
software  science  studies. 

For each of these production programs, the λ value for the data division is
observed to be much higher than that for the procedure division or the whole
program, as was observed in connection with the analyses of the students'
programs. It is also interesting to note that the values of λ for these
programs are generally much lower than the λ values for the student programs.
Shen and Dunsmore observed that λ seems to fall as the program size increases.
However, the two largest programs in this sample have the largest values of λ!
Additional metrics, e.g., NOS, L, D and E, have also been evaluated for each
program. It is interesting to note that for this particular set of programs,
the NOS and E metrics order the programs in the same way. However, we were
unable to obtain data from University Systems which would allow us to validate
E as an estimate of actual development effort.


2.4  Comparison  of  the  Results  Between  the  OSU  Analyzer  and  the  Purdue  Analyzer 

In  order  to  find  out  the  differences  between  the  software  science  metrics 
values  when  using  two  different  counting  strategies,  the  programs  collected 
from  University  Systems  were  run  through  two  different  COBOL  analyzers.  One 
COBOL  analyzer  was  developed  at  OSU,  and  the  other  analyzer  was  produced  by 
the  software  metrics  research  group  at  Purdue  University.  The  metrics  values 
produced  by  the  two  analyzers  are  shown  in  the  following  tables.  Since  it  was 
observed  that  the  best  software  science  estimates  are  generally  achieved  for 
the  entire  program  rather  than  for  the  data  or  procedure  division  alone,  this 
section includes the results only of the analysis of the entire program (i.e.,
the combination of data and procedure divisions). The differences noticed in the
values of the metrics are due to the differences in the counting strategies for
operators and operands proposed by the two groups [2]. It appears that the
Purdue analyzer was unable to analyze the largest program (File 24) among
the ten programs obtained from University Systems.

The  results  of  both  the  analyzers  on  the  University  System  production 
programs  show  that  the  length  equation  works  well  for  almost  all  of  the 
programs  in  the  set.  When  the  OSU  analyzer  is  used,  it  is  noticed  that  the 
length  equation  produces  positive  error  for  the  two  smallest  and  the  largest 
program.  For  all  other  programs,  the  length  equation  produces  negative 
errors.  On  the  other  hand,  the  use  of  Purdue's  analyzer  shows  that  the  length 
equation  produces  negative  errors  for  all  the  programs  in  the  set  except  the 
smallest program. These results, as obtained for this set of ten production
programs, contradict the result observed for 11 AIRMICS production programs
PROG. #               PURDUE                          OSU

            ETA1   N1     ETA2   N2        ETA1   N1     ETA2   N2

File 15     101    1486   329    1368      123    2286   540    1904
File 16     105    1633   360    1562      127    2531   579    2113
File 17     87     1324   268    1212      98     1970   431    1804
File 18     88     1364   276    1250      99     2026   439    1845
File 19     93     1303   200    1028      102    1958   288    1470
File 20     87     1198   208    962       96     1800   303    1404
File 21     87     1118   276    1281      103    2073   399    1552
File 22     91     1323   326    1532      107    2442   463    1900
File 23     89     2385   499    2243      100    3982   693    2933
File 24     -      -      -      -         113    5259   806    3818

PROG. #               PURDUE                              OSU

            N      N̂      (N-N̂)/N   λ         N      N̂      (N-N̂)/N   λ

File 15     2854   3423   -0.20     0.55      4190   5755   -0.37     0.83
File 16     3195   3761   -0.18     0.55      4644   6201   -0.34     0.81
File 17     2536   2722   -0.07     0.56      3774   4420   -0.17     0.78
File 18     2614   2806   -0.07     0.55      3871   4509   -0.16     0.80
File 19     2331   2137   +0.08     0.34      3428   3033   +0.11     0.42
File 20     2160   2162   -0.001    0.44      3204   3129   +0.02     0.53
File 21     2399   2798   -0.17     0.50      3625   4136   -0.14     0.78
File 22     2855   3313   -0.16     0.53      4342   4821   -0.11     0.80
File 23     4628   5048   -0.09     1.06      6915   7204   -0.04     1.47
File 24     -      -      -         -         9077   8552   +0.05     1.22

PROG. #            PURDUE                        OSU

            E          Estimated        E           Estimated
                       Time in Hrs.                 Time in Hrs.

File 15     5201458    80               8537391     131
File 16     6434318    99               10220465    158
File 17     4212549    65               7113125     110
File 18     4447800    68               7315625     113
File 19     4548095    70               7764473     120
File 20     3578873    55               6291590     97
File 21     4121212    63               6636938     102
File 22     5287021    82               8833333     136
File 23     8532264    132              14170000    219
File 24     -          -                24149459    373
reported by Shen and Dunsmore [6], namely that the length equation produces
negative errors for small programs but positive errors for large programs.

It was also found [6, 7] that the range of program sizes for which the
length equation appears to work best is 2000 ≤ N < 4000. It is of interest to
note that the program length predictions for all the programs in that range are
quite satisfactory.
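
The length comparison can be checked directly from the operator and operand counts. The sketch below assumes the standard software science length equation, in which the estimated length is ETA1 * log2(ETA1) + ETA2 * log2(ETA2), and a relative error of (actual length - estimated length) / actual length, which is the sign convention that matches the tabulated errors. The function names are illustrative, and the example uses the Purdue counts reported for File 15.

```python
import math


def estimated_length(eta1: int, eta2: int) -> float:
    """Halstead length estimate from distinct operators and operands."""
    return eta1 * math.log2(eta1) + eta2 * math.log2(eta2)


def relative_error(n1: int, n2: int, eta1: int, eta2: int) -> float:
    """(N - N_hat) / N, where N = N1 + N2 is the observed program length."""
    n = n1 + n2
    return (n - estimated_length(eta1, eta2)) / n


# File 15, Purdue analyzer: ETA1 = 101, N1 = 1486, ETA2 = 329, N2 = 1368.
n_hat = estimated_length(101, 329)
err = relative_error(1486, 1368, 101, 329)
print(f"estimated length = {n_hat:.0f}, relative error = {err:+.2f}")
```

A negative error therefore means that the length equation overestimates the actual program length.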

Another result reported [6] for AIRMICS production programs is that
language level (λ) is affected by the size of the program. In particular,
large N's are accompanied by smaller λ's. However, the λ values obtained for
these 10 production programs, using two different analyzers, do not seem to
support this particular result.
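
One way to make this observation concrete is to compute a correlation coefficient between program length and language level. The sketch below uses the OSU analyzer values for the ten programs (Files 15 through 24). The choice of the Pearson coefficient is an assumption made here for illustration; the report itself does not state which statistic, if any, was used.

```python
import math

# (N, language level) pairs for Files 15-24 as counted by the OSU analyzer.
osu = [(4190, 0.83), (4644, 0.81), (3774, 0.78), (3871, 0.80), (3428, 0.42),
       (3204, 0.53), (3625, 0.78), (4342, 0.80), (6915, 1.47), (9077, 1.22)]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


r = pearson(osu)
print(f"Pearson r between N and language level (OSU counts): {r:.2f}")
```

A clearly positive coefficient means that the larger programs in this sample tend to have the higher language levels, which is the opposite of the trend reported for the AIRMICS programs.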

In  summary,  the  COBOL  studies  provided  mixed  results.  On  the  positive 
side,  the  length  estimate  was  once  again  found  generally  satisfactory.  The 
effort  measure  also  provided  some  more  evidence  that  it  can  be  used  to 
approximate  development  time,  and  can  reliably  estimate  relative  effort  of 
development, at least for (small) student programs. Use of this information
to  assist  in  curriculum  control  was  also  suggested. 

On the negative side, further evidence against the utility of the language
level measure was obtained. Large variances were observed, consistent with
other studies, and the evidence on the relationship between λ and N conflicts
with that of Shen and Dunsmore. Results contrary to those of previous
authors concerning the sign of the relative error in the length estimate were
also obtained. The counting strategies of the OSU and Purdue analyzers
appear sufficiently different that the actual values of several of the
metrics change dramatically. This has very serious implications if these
metrics are to be used in an absolute sense, say as estimates of development
time. It once again points out the need for taking great care in interpreting
the results of software science studies, and in comparing these results with
those of other researchers.


3.  Relationships  Among  Various  Software  Metrics 


3.1  Motivation 

Software  complexity  metrics  appear  to  have  numerous  advantages  in  the 
design,  construction  and  maintenance  of  software  systems.  While  several  such 
metrics  have  been  defined,  and  some  of  them  have  been  validated  on  actual 
systems,  significant  work  remains  to  be  done  to  establish  the  relationships 
among  these  metrics.  This  chapter  shows  the  relationships  among  four 
different  complexity  metrics,  which  were  calculated  for  a  small  set  of  COBOL 
programs.  The  primary  motivation  of  this  study  is  to  investigate  the  extent 
to  which  each  of  the  four  complexity  metrics  correctly  orders  the  programs  by 
their  actual  programming  time,  the  hypothesis  being  that  a  more  complex 

program takes longer to write. The metrics considered in this study were
Halstead's software science effort measure [3], McCabe's cyclomatic complexity
metric [5], Henry and Kafura's information flow complexity metric [4], and
Davis's chunk model complexity metric [1]. Since the information flow
complexity is primarily used to define the complexity of an individual
procedure rather than the complexity of the entire program, a section has been
included to find the relationship between Halstead's effort and the
information flow complexity for each module using the same set of COBOL
programs.

3.2  Background  and  Definitions  of  Metrics 

The  definitions  of  each  complexity  metric  considered  in  this  study  are 
given  in  this  section. 

Halstead's Effort (E) is computed using equation (2.11), as defined
in Chapter 2.

The cyclomatic number, V(G), of a graph with n vertices, e edges, and p
connected components is [5]

V(G) = e - n + p.

McCabe assumes that every program can be represented by a directed graph
where the edges represent different control paths and the nodes represent
processing segments. The number of components, p, can be identified with the
number of different routines in a program. It can be shown that the
cyclomatic complexity for a program is a function of the number of predicates
in the program. Formally, if G is a program containing M binary decision
points (e.g., IF, WHILE, FOR) then the cyclomatic complexity, V(G), is

V(G) = M + 1.

McCabe has shown that this value of V(G) denotes the cardinality of a basis
set of paths through a program.
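
The two forms of the cyclomatic number can be compared on a small example. The sketch below is illustrative only: the flow graph is a hypothetical single IF ... ELSE, and it follows McCabe's convention of adding an edge from the exit node back to the entry node so that the graph is strongly connected and the formula e - n + p applies.

```python
def cyclomatic_number(edges, components=1):
    """V(G) = e - n + p for a graph given as (source, target) pairs."""
    nodes = {vertex for edge in edges for vertex in edge}
    return len(edges) - len(nodes) + components


def cyclomatic_from_decisions(decision_points):
    """V(G) = M + 1 for a program with M binary decision points."""
    return decision_points + 1


# Hypothetical flow graph for one IF ... ELSE statement.
# The last edge is the added exit-to-entry edge.
if_else_graph = [
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
    ("exit", "cond"),
]

print(cyclomatic_number(if_else_graph))   # 5 edges - 4 nodes + 1 component
print(cyclomatic_from_decisions(1))       # one binary decision point
```

Both calls give the same value for this graph, which is the agreement between the graph form and the predicate-count form described above.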

Henry  and  Kafura's  Information  Flow  Complexity  Metric  (IFC) 


The information flow complexity metric [4] deals directly with system
connectivity by observing the flow of information or control among system
components. In this case, the formula for defining the complexity value of a
procedure is

IFC = Length * (fan-in * fan-out)^2,

where

Length of a procedure is defined as the number of lines of text in
the source code for the procedure.

Fan-in of a procedure A is the number of local flows into
procedure A plus the number of data structures from which the procedure
A retrieves information.
The term "local flow" is defined as follows:

There is a local flow of information from module A to module B if one or
more of the following conditions holds.

1. If A calls B,

2. If B calls A and A returns a value to B, which B subsequently
utilizes, or

3. If C calls both A and B, passing an output value from A to B.

For example, consider the following structure chart for a COBOL program:
Fan-in of B = (# local flows into B) + (# DS from which B retrieves
information)

= 2 + 3 = 5

since there is a local flow from A to B (by definition 1) and also
from C to B (by definition 2). The data elements from which B
retrieves information are P, Q and Y.

Fan-out of B = (# local flows from B) + (# DS which B updates)

= 3 + 3 = 6

since from B there is local flow to C, D and E, and the data elements
updated by B are X, Z and Q.


The details of the counting strategy used to find information flow
complexity for COBOL programs are explicitly listed in Appendix A.
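
The worked example for module B can be expressed as a short calculation. The sketch below assumes the Henry and Kafura form Length * (fan-in * fan-out)^2. The fan-in and fan-out values are those derived above, while the length of 10 lines is a hypothetical figure chosen only to complete the example, since the length of module B is not given.

```python
def information_flow_complexity(length: int, fan_in: int, fan_out: int) -> int:
    """Henry and Kafura's measure: Length * (fan-in * fan-out) ** 2."""
    return length * (fan_in * fan_out) ** 2


# Module B from the structure chart example.
fan_in_b = 2 + 3    # local flows from A and C, plus data elements P, Q, Y read
fan_out_b = 3 + 3   # local flows to C, D and E, plus data elements X, Z, Q updated

# Assumption: a length of 10 source lines, used purely for illustration.
ifc_b = information_flow_complexity(10, fan_in_b, fan_out_b)
print(ifc_b)
```

Because the connectivity term is squared, a modest change in fan-in or fan-out changes the result far more than a similar change in length.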

Chunk  Model  Complexity  (C) 

In  this  approach,  chunks  are  used  as  a  basis  of  complexity  measurement. 
The  original  idea  is  based  on  the  fact  that  an  expert  programmer  does  not 
understand  a  program  on  a  character  by  character  or  line  by  line  basis. 
Rather, programmers assimilate groups of statements which have a common
function. These groups are called "chunks". Therefore, the idea is to
consider a program as divided into several chunks based on some definite
criteria  (which  vary  from  language  to  language).  The  complexity  of  each  chunk 
is  determined,  and  these  can  then  be  added  up  to  calculate  the  complexity  of 
the  program.  For  example,  in  a  COBOL  program  each  performed  paragraph  can  be 
treated  as  a  chunk. 


The  final  formula  for  program  complexity  can  then  be  written  in  the  form 
         # chunks
C   =       Σ      [...]
           i=1

         # chunks
    =       Σ      [...]
           i=1


where

C_i = complexity of the ith chunk (e.g., lines of code)
f_i = fan-in for the ith chunk
R   = 2/3, review constant.

It  should  be  noted  that  in  this  case  the  definition  of  fan-in  is  different 
from  the  fan-in  used  in  connection  with  information  flow  complexity  of  the 
previous  section.  In  particular,  here  the  term  "fan-in"  accounts  for  the 
number  of  other  chunks  affected  by  a  particular  chunk.  Formally,  chunk  A  is 
affected  by  chunk  B,  denoted  by  A  ->  B,  if  any  one  of  the  following  conditions 
is  true. 

1. Chunk A has a control connection to B: A =>c B,

2. Chunk A has a data connection to B: A =>d B.

Now,  A  has  control  connection  to  B,  if  A  contains  a  PERFORM  or  GO  TO 
statement  which  references  B.  On  the  other  hand,  A  has  data  connection  to  B  if 
there  is  some  variable  X  whose  value  is  changed  in  B  and  referenced  in  A.  As 
an  example,  consider  the  same  structure  chart  as  before: 


For this case

Fan-in of B = (# control connections to B) + (# data connections to B)

= 1 + 1, (A =>c B, A =>d B), D =>d B

= 2.

Note  that,  although  A  has  both  control  and  data  connection  to  B,  it  is 
counted  only  once  in  calculating  the  fan-in  of  B.  Furthermore,  it  should  be 
mentioned  that  the  fan-in  of  chunk  A  will  be  entirely  determined  by  the  number 
of  data  connections  to  A,  since  there  is  no  control  connection  to  A. 
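
The counting rule, in which a chunk with both a control and a data connection is counted only once, amounts to taking the union of the two sets of connected chunks. The sketch below is a minimal illustration. The connection sets encode only the connections to B named in the example, and the function name and the (source, target) pair representation are assumptions made here.

```python
# Connections are written as (source, target): "source has a connection to target".
control_connections = {("A", "B")}
data_connections = {("A", "B"), ("D", "B")}


def chunk_fan_in(chunk, control_links, data_links):
    """Number of distinct chunks with a control or data connection to `chunk`."""
    sources = {src for src, dst in control_links | data_links if dst == chunk}
    return len(sources)


print(chunk_fan_in("B", control_connections, data_connections))
```

Chunk A appears in both sets but contributes once, so the fan-in of B is the number of distinct connected chunks, A and D.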

3.3  Source  of  Data  and  the  Comparison  of  the  Metrics 

The  COBOL  programs  considered  in  this  study  were  written  by  the  students  of 
an  undergraduate  course  on  "introduction  to  file  processing"  at  the  Ohio  State 
University.  Two  different  sets  of  programs  were  analyzed.  The  first  set  of 
programs  deals  with  input  data  validation  (i.e.,  the  program  edits 
transactions  for  a  master  file  update).  The  other  set  of  programs  is 
concerned  with  product  master  file  update;  that  is,  the  program  updates  a 
product  master  file  by  making  changes  to  several  fields,  e.g.,  product 
description  and  prices,  adding  new  products  and  deleting  old  products.  It 
should  be  mentioned  that  the  students  in  this  course  are  fairly  familiar  with 
COBOL syntax, because this is their second course using COBOL. Each student,
while writing a program, is also required to keep track of the program
development history using a shot log. This shot log provides detailed
information about the actual programming time needed in various activities (e.g.,
time in designing, in coding and in modifying at each subsequent run, etc.).
About 20 students took part in the study. However, after all the shot logs
were obtained, only three subjects corresponding to extreme situations were
selected. In other words, for each type of program, the study was made using
only sample subjects who showed small, medium and very high development
times, respectively. For each of the programs considered in the analysis, all
four  complexity  metrics  were  calculated. 


In order to find Halstead's effort, each program was run through the
Software Science Analyzer [2] developed by the Software Metrics Research group
at  the  Ohio  State  University.  All  other  complexity  metrics  were  evaluated 
manually  using  the  source  code  and  the  hierarchy  chart  for  each  program.  The 
detailed  derivations  of  these  results  are  included  in  Appendix  A. 


The  results  of  calculating  the  four  different  metrics  for  two  sets  of  COBOL 
programs  are  summarized  below: 


Program ID           Reported   Halstead    V(G)   Information   Chunk Model
                     Time in    Effort             Flow          Complexity
                     Hrs.       (E)                Complexity    (C)
                                                   (IFC)

CIS 313-L2-TC0650    15         1104705     36     5051929       314
CIS 313-L2-TC0671    10.5       1241195     41     8661844       295
CIS 313-L2-TC0645    27         1267142     43     4402984       362
CIS 313-L3-TC0645    31         6895476     72     25709599      1031
CIS 313-L3-TC0671    40         5503529     72     23343021      860
CIS 313-L3-TC0622    49         6938627     71     36751009      1309

It is observed that for the first set of programs, the chunk model
complexity shows very good agreement with the reported time in the sense that
it orders the three programs correctly. On the other hand, for this set the
information flow complexity seems to be the worst predictor of time. Finally,
there is good agreement between the complexity values of Halstead E and
V(G). For the second set of programs, E, information flow, and the chunk
model metrics order the programs identically. However, none of these metrics
orders the three programs properly with respect to the time reported by the
programmers.
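
The ordering comparison described above can be reproduced mechanically. The sketch below sorts each set of three programs by reported time and by each metric and then compares the resulting orders. The data are the tabulated values. The short program labels, the dictionary layout and the function name are illustrative, and the Lab 3 identifiers are read from a poorly scanned table, so they should be treated as assumed.

```python
# Tabulated values: reported time (hours), Halstead E, information flow
# complexity (IFC) and chunk model complexity (C) for each program.
lab2 = {
    "TC0650": {"time": 15, "E": 1104705, "IFC": 5051929, "C": 314},
    "TC0671": {"time": 10.5, "E": 1241195, "IFC": 8661844, "C": 295},
    "TC0645": {"time": 27, "E": 1267142, "IFC": 4402984, "C": 362},
}
lab3 = {
    "TC0645": {"time": 31, "E": 6895476, "IFC": 25709599, "C": 1031},
    "TC0671": {"time": 40, "E": 5503529, "IFC": 23343021, "C": 860},
    "TC0622": {"time": 49, "E": 6938627, "IFC": 36751009, "C": 1309},
}


def ordering(programs, key):
    """Program labels sorted from the smallest to the largest value of `key`."""
    return sorted(programs, key=lambda name: programs[name][key])


for label, programs in (("Lab 2", lab2), ("Lab 3", lab3)):
    by_time = ordering(programs, "time")
    for metric in ("E", "IFC", "C"):
        agrees = ordering(programs, metric) == by_time
        print(f"{label}: {metric} orders programs like reported time: {agrees}")
```

V(G) is left out of the sketch because two of the Lab 3 programs share the same value, so its ordering is not unique.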

3.4  Module  Based  Comparison  Between  Effort  and  Information  Flow  Complexity 

It  should  be  noted  that  the  complexity  of  a  procedure  depends  on  two 
factors,  namely,  the  complexity  of  the  procedure  code  and  the  complexity  of 
the procedure's connections to its environment. Halstead's and McCabe's
complexity  measures  appear  to  use  only  the  first  factor.  However,  the 
information  flow  complexity  measure,  while  using  both  the  factors  to  some 
degree,  concentrates  primarily  on  the  procedure's  connections  to  its 
environment  through  the  fan-in  and  fan-out.  Since  both  the  Halstead  measure 
and  the  information  flow  complexity  measure  fail  to  give  uniform  weight  to 
both  the  factors  mentioned  above,  an  attempt  was  made  to  find  out  how  these 
two  measures  differ  with  respect  to  individual  modules  (performed  paragraphs) 
for  the  same  sets  of  COBOL  programs  used  in  the  previous  section.  In  other 
words,  the  purpose  is  to  see  if  both  the  measures  behave  the  same  way,  or  to 
find  out  how  they  differ.  The  detailed  results  of  the  module-based  comparison 
for each sample are presented in this section. Note that E for each module
was obtained by running each procedure division paragraph (along with its
associated data division entries) separately through the Software Science
Analyzer [2].

Consider first the program CIS 313-L2-TC0645. The hierarchy chart for this
program is shown below, where each box denotes a module in the program. The
asterisk (*) in module 2 indicates that it is an iterated part of module 1.
It is observed that for most of the modules there is very high correlation
between the complexity values in the two measures. For example, module 2,
containing the only major loop in the program, shows the highest complexity in
both measures. Modules 6 and 7 seem to be equally complex, as expected from
the code and design shown below. Finally, complexity values in both
measures show that modules 11, 12 and 13 are equally complex, as expected.


Module #


Halstead E


Now consider the program CIS 313-L2-TC0671. The hierarchy chart for this


*MODULE 6
TYPE-1-4.
*******************************************************
* THIS PARAGRAPH CHECKS TO MAKE SURE BOTH THE UPDATE
* PRICE AND THE DESCRIPTION ARE NON-BLANK, IF NOT
* A FLAG IS SET.
*******************************************************
    IF PR-PRICE IS NOT NUMERIC OR
       PR-DESCRIPTION OF PUF-REC IS = ' '
        MOVE 1 TO FLAG1
        MOVE 1 TO FLAG7
        MOVE PR-PRICE TO PRICE.
    IF PR-PRICE IS NUMERIC
        MOVE PR-PRICE1 TO PRICE1.


*MODULE 7
TYPE-2-6-9.
*******************************************************
* FOR ACTIVATE, DEACTIVATE AND DELETE BOTH THE UPDATE
* PRICE AND DESCRIPTION MUST BE BLANK, THIS PARAGRAPH
* SETS A FLAG IF THEY ARE NOT.
*******************************************************
    IF PR-PRICE IS NOT = ' ' OR
       PR-DESCRIPTION OF PUF-REC NOT = ' '
        MOVE 1 TO FLAG1
        MOVE 1 TO FLAG10
    IF PR-PRICE IS NUMERIC
        MOVE PR-PRICE1 TO PRICE1,
        MOVE PR-PRICE TO PRICE.
*MODULE 11
HEADER-PR1.
*******************************************************
* THIS IS THE PARAGRAPH TO PRINT A HEADING ON THE TOP
* OF THE PAGE FOR ALL THE UPDATE REQUESTS.
*******************************************************
    WRITE PRINTER-REC1 FROM T-HDG
        AFTER ADVANCING TO-TOP-OF-PAGE.
    MOVE 'ALL UPDATE REQUESTS' TO TITLE-PHRASE.
    ADD 1 TO PAGE-CT1.
    MOVE PAGE-CT1 TO PAGE-OUT.
    WRITE PRINTER-REC1 FROM T-HDG-1
        AFTER ADVANCING 2 LINES.
    WRITE PRINTER-REC1 FROM HDG-1
        AFTER ADVANCING 2 LINES.
    WRITE PRINTER-REC1 FROM HDG-2
        AFTER ADVANCING 1 LINES.
    MOVE 6 TO LINE-CT1.


*MODULE 12
HEADER-PR2.
*******************************************************
* THIS PARAGRAPH PRINTS A HEADING ON THE TOP OF THE PAGE
* FOR THE REPORT OF ALL THE GOOD UPDATE REQUESTS.
*******************************************************
    WRITE PRINTER-REC2 FROM T-HDG
        AFTER ADVANCING TO-TOP-OF-PAGE.
    MOVE 'GOOD UPDATE REQUESTS' TO TITLE-PHRASE.
    ADD 1 TO PAGE-CT2.
    MOVE PAGE-CT2 TO PAGE-OUT.
    WRITE PRINTER-REC2 FROM T-HDG-1
        AFTER ADVANCING 2 LINES.
    WRITE PRINTER-REC2 FROM HDG-1
        AFTER ADVANCING 2 LINES.
    WRITE PRINTER-REC2 FROM HDG-2
        AFTER ADVANCING 1 LINES.
    MOVE 6 TO LINE-CT2.


*MODULE 13
HEADER-PR3.
*******************************************************
* THIS IS THE PARAGRAPH TO PRINT A HEADING ON THE TOP
* OF THE PAGE FOR ALL THE INVALID UPDATE REQUESTS.
*******************************************************
    WRITE PRINTER-REC3 FROM T-HDG
        AFTER ADVANCING TO-TOP-OF-PAGE.
    MOVE 'INVALID UPDATE REQUESTS' TO TITLE-PHRASE.
    ADD 1 TO PAGE-CT3.
    MOVE PAGE-CT3 TO PAGE-OUT.
    WRITE PRINTER-REC3 FROM T-HDG-1
        AFTER ADVANCING 2 LINES.
    WRITE PRINTER-REC3 FROM HDG-1
        AFTER ADVANCING 2 LINES.
    WRITE PRINTER-REC3 FROM HDG-2
        AFTER ADVANCING 1 LINES.
    MOVE 6 TO LINE-CT3.
complexity metric for each module are listed in the following table.


Module #


Halstead  E 


114327 


7973176 


112896 

348160 


Table  67:  Program  ID:  CIS  313-L2-TC0671 


Note that, for most of the modules, the complexity measures in the two
cases are comparable. For example, module 2 is the only major loop in the
program, and the complexity value of this module is highest for both
measures, as might be expected. Modules 4 and 7 are very small and perform
very similar functions (in fact, both modules contain only one IF
statement and a MOVE statement). The complexity values of these two modules
are very close for both measures. However, the difference in the complexity
values between modules 12 and 13 is noticeable. In particular, the E measures
of modules 12 and 13 are exactly the same, but the information flow complexity
suggests that the complexity of module 13 is somewhat higher than that of
module 12. Examining the actual code (as shown below), it can be seen that
both of these modules have virtually identical functions and contain the same
number of lines of code. So, intuitively, the complexity should be about the
same.


The variation in information flow complexity values between
modules 12 and 13 is due to the higher fan-in value of module 13. The
inconsistency between the two complexity metrics suggests that the two metrics
serve different purposes. In particular,


*MODULE 12
*********************************************
*THE NEW-HEADER-2 IS USED TO PRINT THE NEW  *
*HEADER AT THE TOP OF A NEW PAGE FOR        *
*PRINTER-FILE-2.                            *
*********************************************
 NEW-HEADER-2.
     IF TOTAL-LINES-VALID NOT LESS THAN MAX-LINES
         MOVE PAGE-COUNT-VALID TO VALID-UPDATE-PAGE
         WRITE PRINTER-RECORD-2 FROM VALID-UPDATE-HEADER
             AFTER ADVANCING TO-TOP-OF-PAGE
         WRITE PRINTER-RECORD-2 FROM DATE-HEADER
             AFTER ADVANCING 1 LINES
         WRITE PRINTER-RECORD-2 FROM TOPPER-1
             AFTER ADVANCING 3 LINES
         WRITE PRINTER-RECORD-2 FROM TOPPER-2
             AFTER ADVANCING 1 LINES
         MOVE 6 TO TOTAL-LINES-VALID
         ADD 1 TO PAGE-COUNT-VALID.


*MODULE 13
*********************************************
*THE NEW-HEADER-3 IS USED TO PRINT THE NEW  *
*HEADER AT THE TOP OF A NEW PAGE FOR        *
*PRINTER-FILE-3.                            *
*********************************************
 NEW-HEADER-3.
     IF TOTAL-LINES-INVALID NOT LESS THAN MAX-LINES
         MOVE PAGE-COUNT-INVALID TO INVALID-UPDATE-PAGE
         WRITE PRINTER-RECORD-3 FROM INVALID-UPDATE-HEADER
             AFTER ADVANCING TO-TOP-OF-PAGE
         WRITE PRINTER-RECORD-3 FROM DATE-HEADER
             AFTER ADVANCING 1 LINES
         WRITE PRINTER-RECORD-3 FROM TOPPER-1
             AFTER ADVANCING 3 LINES
         WRITE PRINTER-RECORD-3 FROM TOPPER-2
             AFTER ADVANCING 1 LINES
         MOVE 6 TO TOTAL-LINES-INVALID
         ADD 1 TO PAGE-COUNT-INVALID.


1.  Effort may be a development-oriented metric (once the specification
    is given) rather than a maintenance-oriented metric. In contrast,
    the information flow complexity may be a better indicator of
    modifiability, since information flow complexity strongly
    emphasizes the procedure's connection to its environment through
    fan-in and fan-out. (The chunk model might also be a better
    indicator of modifiability.)

2.  Effort may be a suitable metric for understanding a subsystem in
    terms of its particular function, but not for understanding
    a modification to that subsystem in connection with the whole
    system. That is, Effort is not concerned with the
    interconnectivity or interdependency of modules with one another.

3.  Effort helps to understand a module in isolation, i.e., one
    subactivity of the system, not the total activity. The
    information flow complexity metric, by contrast, accounts for the
    total activity required for maintenance purposes.

We therefore conclude that the Halstead Effort measure may not be sensitive
to the integration problem, in contrast to the information flow complexity
metric. Similar discrepancies among the modules in other programs of the set
can be explained in the same way. The next chapter further explores the
problem of measuring integration effort.
4.  Development of a New Approach to Measuring Software Effort




4.1  Motivation 

In  this  chapter,  an  attempt  is  made  to  develop  a  new  approach  for  measuring 
the  software  effort  in  a  COBOL  environment,  where  effort  will  be  based  on 
programming  time. 




The programming time may be estimated using the software science formulas
(2.11) - (2.12), as discussed in Chapter 2. The experiments reported by
Halstead [3], comparing actual programming times with the programming times
estimated from E, involved only one subject. Another small experiment
conducted later shows that when the effort measure is applied to large
programs with multiple modules, it consistently overestimates programming
time [8]. A more recent set of experiments suggests that larger modules in
multi-module programs should be conceptually broken into smaller parts
before applying the E measure [9, 10]. It is observed that the use of
S = 18 to convert the E measure to T works best for modules which take less
than two hours to produce and which are less than 50 lines of code in length.


Our  study  has  been  conducted  in  a  COBOL  environment  in  order  to  compare  the 
actual  programming  time  and  the  estimated  programming  time.  Two  different 
sets  of  COBOL  programs  were  considered  for  this  study  (the  detailed 
descriptions  of  these  programs  were  given  in  the  previous  chapter).  Each  set 
contains  three  subjects. 

After these programs were submitted by the programmers, each program was
run through the Software Science Analyzer [2] at the Ohio State University in
order to get the various software science metrics, including Effort. The
effort value for each program, along with the estimated time (calculated
using equation (2.12)) and the reported time provided by the programmer, is
shown below. In the present analysis, S is taken to be 18, consistent with
other studies which have examined Halstead's programming time relationship
[3].


Prog.   Program-ID           EFFORT, E    Estimated    Reported
Set                                       Time, T,     Time in
                                          in Hours     Hours

 1      CIS 313-L2-TC0650    1104705       17           15
        CIS 313-L2-TC0671    1241195       19           10.5
        CIS 313-L2-TC0645    1267142       19.5         27

 2      CIS 313-L3-TC0671    5503529       85           40
        CIS 313-L3-TC0645    6895476      106           31
        CIS 313-L3-TC0622    6938627      107           49
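
The estimated times in this table can be reproduced from the effort values.
The following Python sketch is only illustrative: it assumes the software
science relation T = E / S seconds with S = 18, as stated in the text, and
converts seconds to hours. The function name `estimated_hours` is an
assumption made for this example and is not part of the Software Science
Analyzer.

```python
def estimated_hours(effort, stroud_number=18):
    """Convert a Halstead effort value E into estimated programming hours.

    Assumes T = E / S seconds (S is the Stroud number, 18 in this study)
    and 3600 seconds per hour.
    """
    return effort / (stroud_number * 3600)


# Effort values (E) for the Set 1 programs, as reported in this section.
set_1_efforts = {
    "CIS 313-L2-TC0650": 1104705,
    "CIS 313-L2-TC0671": 1241195,
    "CIS 313-L2-TC0645": 1267142,
}

for program_id, effort in set_1_efforts.items():
    print(f"{program_id}: about {estimated_hours(effort):.1f} hours")
```

The printed values agree with the tabulated estimates to within rounding.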

It should be mentioned that the programs in Set 1 contain 13 to 14
paragraphs in the procedure division, while the programs in Set 2 contain
between 32 and 36 procedure division paragraphs. The results obtained for
these two sets of COBOL programs agree with the result obtained in [8],
namely, as the number of modules in a program increases, Halstead's E
continuously overestimates the programming time. When a similar study was
performed for smaller programs (e.g., programs containing only 1 to 4
procedure division paragraphs), it was observed that the E value
underestimates the programming time, as shown in the table below.


Program-ID           EFFORT     Estimated    Reported
                     (E)        Time, T,     Time in
                                in Hours     Hours

CIS 212-L2-TC1181    431250     6.7          11
CIS 212-L2-TC1183    361542     5.6          24
CIS 212-L2-TC1193    399532     6.2          26
CIS 212-L2-TC1194    546986     8.4          17
CIS 212-L2-TC1195    436812     6.7          15
CIS 212-L2-TC1199    414357     6.4          15
CIS 212-L2-TC1200    392176     6.1          13
CIS 212-L2-TC1208    413988     6.4          28.5
CIS 212-L2-TC1210    513006     7.9          27
CIS 212-L2-TC1211    511548     7.9          15.5

The programs in this table were collected from an undergraduate course on
"Computer Data Processing" at Ohio State University. The program deals
primarily with the formatted listing of an input file, along with some control
totals, e.g., total number of records in the file. Each student, while
writing the program, is required to keep track of the development history
using a shot log. The shot log provides the actual time spent by the
programmer at various stages (e.g., in designing, in coding, in debugging,
etc.).

Based on the behavior of E, as observed for COBOL programs of various
sizes, it is evident that a new approach to measuring development effort is
warranted. This chapter is devoted to addressing this need.

4.2  Formulation  of  the  Approach 

Software typically consists of a set of modules. The total effort required
to develop a piece of software can then be defined as the sum of the unit
efforts of all the modules plus the effort needed to integrate these modules
into a single system. In other words, the total development effort can be
expressed as:

    E_{Total} = \sum_{i=1}^{q} U(i) + E_I                          (4.1)

and the estimated development time might be computed (a la Software
Science) according to the expression

    T_{est} = E_{Total} / (S \times 3600) hours                    (4.2)

where

    U(i) = Effort needed to write the ith module (for COBOL, this
           will mean the appropriate paragraph of procedure division
           together with its accompanying data division entries)
           (UNIT EFFORT of the ith module);

    E_I  = Effort required to integrate all the modules to form
           a complete system (INTEGRATION EFFORT);

    q    = Number of modules (paragraphs) in the procedure division.

In the present analyses, the unit effort U(i) of the ith module is
considered to be the Halstead E measure for this module, together with its
accompanying data division entries. The integration effort E_I of the system
has been approached using the following three strategies.

4.2.1  Strategy  1 

The integration effort for a subsystem is the effort required to integrate
or combine all the modules contained in the subsystem. So, intuitively, only
a fraction of the total effort of all the modules contained in the subsystem
would contribute to the integration effort. Therefore, one possible
approach to finding the integration effort can be formalized as follows.

    E_I of a subsystem = K [Effort of the union of all the
                            modules in the subsystem except
                            the driver (calling module)]

where K is a multiplication factor denoting a fraction of the union above.
Let us arbitrarily choose this fraction to be 1/2. The integration effort for
each subsystem can be calculated using this strategy, which can then be
extended to find the integration effort for the entire system. Once the E_I
for the whole system is obtained, the use of equation (4.1) will allow
E_{Total}, and hence T_{est}, for the software system to be calculated.

To illustrate, the structure chart for the program CIS 313-L2-TC0650 and
the entire derivation using this strategy follow:

[Figure: structure chart for program CIS 313-L2-TC0650]


    E(sub 13) = E(sub(13, 14))
              = U(13) + U(14) + K E(14)
              = U(13) + U(14) + K U(14),   since E(14) = U(14)

    E(sub 10) = E(sub(10, 11, 12, 13, 14))
              = U(10) + U(11) + U(12) + E(sub 13) +
                K[U(11, 12, 13, 14)]

    E(sub 4)  = E(sub(4, 7, 8, 9, 5))
              = U(4) + U(5) + U(7) + U(8) + U(9) +
                K[U(5, 7, 8, 9)]

    E(sub 2)  = E(sub(2, 3, 4, 5, 6, 10))
              = U(2) + U(3) + E(sub 4) + U(5) + U(6) +
                E(sub 10) + K[U(3, 4, 5, ..., 14)]

    E_{Total} = U(1) + E(sub 2) + K[U(2, 3, ..., 14)]

              = \sum_{i=1}^{14} U(i) + K[U(14) + U(5, 7, 8, 9) +
                U(11, 12, 13, 14) + U(3, 4, ..., 14) +
                U(2, 3, ..., 14)]                                  (4.3)

Note: The notation U(n_1, n_2, ..., n_k) means the effort of the
union of the modules n_1, n_2, ..., n_k.
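
The unions appearing in the derivation above can be generated mechanically
from the call structure. The sketch below is a minimal illustration under
stated assumptions: the `CALLS` table is inferred from the E(sub ...) terms of
the derivation (it is not copied from the structure chart itself), and
`union_effort` is a hypothetical caller-supplied function returning the
Halstead E of a group of modules measured as one unit.

```python
# Call structure inferred from the derivation (an assumption of this sketch):
# each driver maps to the modules it calls directly.
CALLS = {
    1: [2],
    2: [3, 4, 5, 6, 10],
    4: [7, 8, 9, 5],
    10: [11, 12, 13],
    13: [14],
}


def descendants(driver, calls):
    """Return every module below `driver`, excluding the driver itself."""
    seen = set()
    stack = list(calls.get(driver, []))
    while stack:
        module = stack.pop()
        if module not in seen:
            seen.add(module)
            stack.extend(calls.get(module, []))
    return seen


def strategy_1_unions(calls):
    """Strategy 1: one union per driver, holding all modules below it."""
    return {driver: sorted(descendants(driver, calls)) for driver in calls}


def integration_effort(unions, union_effort, k=0.5):
    """E_I = K * (sum of the efforts of the unions), with K = 1/2 by default."""
    return k * sum(union_effort(modules) for modules in unions.values())


for driver, modules in sorted(strategy_1_unions(CALLS).items()):
    print(f"driver {driver}: U{tuple(modules)}")
```

The five unions printed correspond to the five U(...) terms inside the
bracket of the final expression for E_{Total}.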


4.2.2  Strategy  2 

By examining the second term of equation (4.3), it can be seen that E_I
contains the effort value of some modules more than once. In particular,
modules not directly called by the subsystem driver contribute repeatedly
to the integration effort of the subsystem according to equation (4.3).
The result of compounding the efforts for these modules may cause the
estimated effort to be too large.

Therefore, when computing the integration effort, Strategy 2 will count
only those modules which are directly called by the driver rather than all the
modules below the calling module. That is,

    E_I of a subsystem = K [Effort of the union of all
                            the modules directly called by
                            the subsystem driver],

where K is still considered to be 1/2.

The detailed derivation for calculating the E_{Total} for the same system
using Strategy 2 follows.

    E(sub 13) = E(sub(13, 14))
              = U(13) + U(14) + K U(14)

    E(sub 10) = E(sub(10, 11, 12, 13, 14))
              = U(10) + U(11) + U(12) + E(sub 13) +
                K[U(11, 12, 13)]

    E(sub 4)  = E(sub(4, 7, 8, 9, 5))
              = U(4) + U(5) + U(7) + U(8) + U(9) +
                K[U(5, 7, 8, 9)]

    E(sub 2)  = E(sub(2, 3, 4, 5, 6, 10))
              = U(2) + U(3) + E(sub 4) + U(5) + U(6) + E(sub 10) +
                K[U(3, 4, 5, 6, 10)]

    E_{Total} = U(1) + E(sub 2) + K U(2)

              = \sum_{i=1}^{14} U(i) + K[U(2) + U(14) + U(5, 7, 8, 9) +
                U(11, 12, 13) + U(3, 4, 5, 6, 10)]                 (4.4)
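
The compounding that Strategy 2 is meant to avoid can be made visible by
counting how often each module enters a union under the two strategies. The
following sketch is illustrative only; the `CALLS` table is again an
assumption inferred from the derivations, not a transcription of the
structure chart.

```python
from collections import Counter

# Call structure inferred from the derivations (an assumption of this sketch).
CALLS = {
    1: [2],
    2: [3, 4, 5, 6, 10],
    4: [7, 8, 9, 5],
    10: [11, 12, 13],
    13: [14],
}


def all_descendants(driver, calls):
    """Every module below `driver` in the call structure (Strategy 1)."""
    seen = set()
    stack = list(calls.get(driver, []))
    while stack:
        module = stack.pop()
        if module not in seen:
            seen.add(module)
            stack.extend(calls.get(module, []))
    return seen


def times_counted(unions):
    """Count how many unions each module appears in."""
    counts = Counter()
    for modules in unions.values():
        counts.update(modules)
    return counts


strategy_1 = {driver: all_descendants(driver, CALLS) for driver in CALLS}
strategy_2 = {driver: set(callees) for driver, callees in CALLS.items()}

counts_1 = times_counted(strategy_1)
counts_2 = times_counted(strategy_2)

# Module 14 sits at the bottom of the chart: it enters four unions under
# Strategy 1 but only one under Strategy 2.
print(counts_1[14], counts_2[14])
```

Under these assumptions, the deeper a module lies in the chart, the more
often Strategy 1 counts it, whereas Strategy 2 counts it once per direct
caller.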


4.2.3 Strategy 3

In the previous strategies for finding the integration effort of a
subsystem, the multiplication factor K was arbitrarily chosen to be 1/2. In
practice, however, some modules are always more difficult to integrate than
others, depending on such factors as the interaction between the calling
module and the called module, the number of times a particular module is
called from various parts of the system, etc. In other words, the modules
should not all be weighted the same, and there should be some means of
discriminating the multiplication factors associated with different
subsystems. Hence, in order to provide better justification for the choice of
K, Strategy 3 uses the following rule for selecting the multiplication factor
to be associated with a subsystem.

    K = \frac{\text{number of data elements common to the subsystem driver
                    and the union of the modules under consideration}}
             {\text{total number of data elements in the entire union of
                    modules of the subsystem under consideration}}

where 0 \le K \le 1.

The exact value of K will depend on the amount of interaction among the
modules contained in a subsystem. A subsystem which has a large number of
variables in common between the subsystem driver and the rest of the modules
is evidently more difficult to integrate than one having fewer data elements
in common between the driver and the rest of the subsystem. In
particular, if the calling module of the subsystem does not have any variable
in common with the union of the rest of the modules, then K = 0. On the other
hand, if the calling module has all its data elements in common with the
union, then K = 1. In general, however, these extreme values of K do not seem
to occur very often. The value of K, in most situations, is observed to be
less than 1/2 and depends on the number and the size of the modules contained
in the union under consideration. To find the integration effort of a
subsystem using Strategy 3, the value of K, as calculated using the rule
stated above, is incorporated into equation (4.3) of Strategy 1.
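
The rule for K is a ratio of two set sizes, which the following sketch makes
explicit. It is a minimal illustration, assuming each module is described
simply by the set of data-element names it references; the function name
`multiplication_factor` and the element names are hypothetical.

```python
from fractions import Fraction


def multiplication_factor(driver_elements, module_elements):
    """Strategy 3 factor K for one subsystem.

    driver_elements: data elements referenced by the subsystem driver.
    module_elements: one collection of data elements per module in the
                     union under consideration.
    """
    union = set().union(*module_elements)
    if not union:
        raise ValueError("the union of modules references no data elements")
    common = set(driver_elements) & union
    return Fraction(len(common), len(union))


# Hypothetical subsystem: a driver and two called modules.
driver = {"A", "B", "C", "M", "N"}
modules = [{"A", "B", "C", "L"}, {"A", "B", "C", "H"}]
print(multiplication_factor(driver, modules))
```

Exact fractions are used so that the result can be compared directly with
hand calculations.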


4.3  Preliminary  Results 

All three strategies discussed in the previous section were applied to two
sets of COBOL programs, as described earlier in the chapter (Section 4.1).
The preliminary results obtained in all three cases are summarized in the
table shown below. The detailed calculations for each program have been
included in Appendix B.
                    Estimated using    Estimated using    Estimated using    Estimated using    Reported
                    Halstead's Eq.     Strategy 1         Strategy 2         Strategy 3         Time
                                                                                                in Hours
Program ID          EFFORT    Time T   EFFORT    Time T1  EFFORT    Time T2  EFFORT    Time T3
                              in Hrs.            in Hrs.            in Hrs.            in Hrs.

CIS 313-L2-TC0650   1104705   17       1475028   22.7     806751    12.4     977778    15.08    15.0
CIS 313-L2-TC0671   1241195   19       1399156   21.6     727586    11.2     1033443   15.9     10.5
CIS 313-L2-TC0645   1267142   19.5     1413010   21.8     775025    12       1121026   17.3     27.0
CIS 313-L3-TC0671   5503529   85       6799232   105      2508999   38.7     2895952   44.7     40
CIS 313-L3-TC0645   6895476   106      6863622   105.9    3705425   57       5220200   80       31
CIS 313-L3-TC0622   6938627   107      7180899   110      2937307   45       4078366   63       49

Note that T1 (T_{est} using Strategy 1) is generally much higher than the
actual reported time, as expected from the nature of equation (4.3). The
summary of the results indicates that in some cases Strategy 2 works better
than Strategy 3 and vice versa, although neither of these strategies works
uniformly well in predicting the actual time required to write the software.
Both T2 and T3 appear to approximate reported programming time more
uniformly than do T1 or T (the Halstead estimate), particularly on the
larger programs (Set 2). While this evidence is very preliminary and
inconclusive, it does suggest that new approaches to measuring integration
effort may yield more useful approximations of development time.
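
One way to make this comparison concrete is to summarize, for each
estimate, its average distance from the reported time. The sketch below uses
the mean absolute error, which is a choice made for this illustration and
not a measure used in the study; the figures are the times from the table of
preliminary results.

```python
# Reported times and the four estimates (hours), in the table's row order.
REPORTED = [15.0, 10.5, 27.0, 40.0, 31.0, 49.0]

ESTIMATES = {
    "T (Halstead)": [17, 19, 19.5, 85, 106, 107],
    "T1 (Strategy 1)": [22.7, 21.6, 21.8, 105, 105.9, 110],
    "T2 (Strategy 2)": [12.4, 11.2, 12, 38.7, 57, 45],
    "T3 (Strategy 3)": [15.08, 15.9, 17.3, 44.7, 80, 63],
}


def mean_absolute_error(estimates, reported):
    """Average of |estimate - reported| over all programs."""
    return sum(abs(e - r) for e, r in zip(estimates, reported)) / len(reported)


errors = {
    name: mean_absolute_error(values, REPORTED)
    for name, values in ESTIMATES.items()
}

for name, error in sorted(errors.items(), key=lambda item: item[1]):
    print(f"{name}: {error:.1f} hours")
```

With only six programs, such a summary is descriptive at best and should
not be read as a statistical result.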
4.4  Further Refinement

The  results  of  the  previous  section  suggest  that  further  refinement  of  the 
models  proposed  is  required  to  obtain  more  reasonable  agreement  between  the 
estimated  time  and  the  reported  time.  Some  of  the  issues  which  seem  to  be 
helpful  for  further  refinements  of  the  models  are  described  below. 

1.  The interaction between the subsystem driver and each individual
    module contained in the subsystem should be more carefully
    considered. Consider an example as shown below, where the driver P
    calls two modules A1 and A2 sequentially.

                    +---+
                    | P |
                    +---+
                    /   \
                   /     \
                 A1       A2

    Suppose the set of data elements that appear in P is A, B, C, M,
    N. Now two different cases can be considered depending on how the
    data elements appear in modules A1 and A2. For example, in one
    case A1 and A2 may contain the data elements A, B, C, L and A, B,
    C, H respectively. In the other case A1 and A2 may contain data
    elements A, B, L and C, H respectively. According to Strategy 3
    (discussed in Section 4.2), in both cases the multiplication
    factor K = 3/5. The integration effort for this subsystem is,
    therefore,

        K[U(A1, A2)] = 3/5 [U(A1, A2)]

    where U(A1, A2) denotes the effort when A1 and A2 are combined
    together.

    It should be noted that the multiplication factor K is used to
    define the interaction between a calling module and the rest of the
    modules in the subsystem. In the present example, there is clearly
    more interaction between the driver and the called modules in the
    first case than there is in the second case. Intuitively, one
    might therefore expect the first case to require more integration
    effort than the second case. That is, the value of K in the first
    case should be greater than 3/5, and in the second case it should
    be less than 3/5. To overcome this problem, we might separately
    calculate K (using the rule of Strategy 3) between the driver and
    each individual module contained in the subsystem, and then take
    the average of all K's. This average value of K may be used as the
    multiplication factor associated with the subsystem. In the first
    case of the current example, the interaction between P and A1 would
    be given by

        K_1 = 3/4

    and that between P and A2 is also

        K_2 = 3/4.

    Hence, K for the subsystem is found to be

        K = (K_1 + K_2) / 2 = 3/4   (which is > 3/5).

    The similar calculations for the second case give

        K_1 = 2/3, K_2 = 1/2 and K = 7/12   (which is less than 3/5).

    The above discussion and the results of the previous section
    indicate that a reasonable model of integration effort might use
    Strategy 2 (discussed in Section 4.2) with the K value
    calculated using the strategy described above instead of the
    arbitrarily chosen value of K = 1/2.

2.  In  Chapter  2,  it  was  realized  that  both  the  information  flow 
complexity  and  the  chunk  model  complexity  give  strong  emphasis  to 
the  connectivity  among  system  components.  Therefore,  it  may  be 
possible  to  apply  the  idea  of  information  flow  and  chunk  model  for 
evaluating  the  integration  effort  of  a  system,  since  the 
integration  effort  primarily  accounts  for  the  connectivity  among 
different  system  components. 
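
The averaging rule proposed in item 1 above can be checked on the P, A1, A2
example. This is a minimal sketch, assuming each module is represented by
the set of data-element names it contains; the function names `pairwise_k`
and `averaged_k` are illustrative.

```python
from fractions import Fraction


def pairwise_k(driver, module):
    """Strategy 3 rule applied to the driver and one called module."""
    module = set(module)
    return Fraction(len(set(driver) & module), len(module))


def averaged_k(driver, modules):
    """Average of the pairwise K values over all called modules."""
    values = [pairwise_k(driver, module) for module in modules]
    return sum(values, Fraction(0)) / len(values)


P = {"A", "B", "C", "M", "N"}

# First case: A1 = {A, B, C, L}, A2 = {A, B, C, H}.
first_case = [{"A", "B", "C", "L"}, {"A", "B", "C", "H"}]
# Second case: A1 = {A, B, L}, A2 = {C, H}.
second_case = [{"A", "B", "L"}, {"C", "H"}]

print(averaged_k(P, first_case))   # 3/4
print(averaged_k(P, second_case))  # 7/12
```

The two cases, which Strategy 3 cannot tell apart (K = 3/5 for both), now
receive different factors on either side of 3/5.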

It  is  our  opinion  that,  once  a  more  accurate  method  for  calculating  the 
integration  effort  associated  with  a  system  is  found,  more  reliable  estimates 
of  the  actual  development  time  for  a  piece  of  software  can  be  obtained. 


5.  Conclusion 

The Software Science Analyzer developed at Ohio State University makes it
possible to collect and analyze a large number of COBOL programs.
Various kinds of analyses using these programs are possible, including those
initiated in this report. The major part of the report was devoted to the
validation of the software science metrics on the basis of the analyses of a
large number of COBOL programs. It was observed that for CIS 212 programs
(small programs), the best estimate of the program length is attained for the
entire program rather than for the data or procedure division alone. However,
for the larger CIS 313 and University Systems programs, the length equation
works equally well for the entire program and for the procedure
division alone. But since the data division is a significant part of any
COBOL program and may require a considerable amount of programming effort, we
recommend that it be included in software science studies. Secondly, software
science postulates that the language level (λ) would be constant for all
programs written in a given language. The present analyses, however, indicate
that the language level is not constant. Its use in other software science
relationships is therefore suspect, and it is not recommended as a useful
metric to be applied to an individual program. Finally, the estimated
programming time (as calculated from the Effort metric) provided tantalizingly
good values for many of the smaller student programs, but failed to produce
good results when applied to larger programs. We feel that this is due to a
faulty capturing of integration effort by the Halstead E measure. For small
programs, integration is of minimal importance, so E may work well. For large
programs, however, integration effort is critical.


We also studied the interrelationships among several software metrics,
namely, Halstead's effort, McCabe's cyclomatic complexity, Kafura's
information flow complexity and Davis' chunk model complexity. In particular,
we studied the behavior of each of these measures as estimates of programming
[...] software effort, measures in which we can place much confidence. We have
suggested  some  approaches  for  further  study,  and  reasons  why  we  believe  they 
may  be  fruitful  directions  to  pursue.  The  approaches  are  in  some  ways 
derivatives  of  existing  measures,  so  that  past  work  may  not  all  be  in  vain. 
They  attempt  to  promote  the  strengths  of  existing  measures  while  correcting 
observed  weaknesses.  Without  a  fairly  solid  theory  on  which  to  rest  the 
development  of  software  metrics,  this  appears  to  be  the  best  we  can  do. 
Laboratory  experiments  of  the  future  will  attest  to  the  merits  of  any  new 
approaches.  The  current  metrics  of  software  complexity,  however,  appear  weak 
and  very  incomplete. 

The detailed counting strategy used to calculate the fan-in and fan-out
associated with each module of a COBOL program is described. The explicit
calculations for finding the information flow complexity metric and the chunk
model complexity metric corresponding to each program, considered in Chapter
3, are also included.

1.1  Counting  Strategy  Used  to  Find  the  Information  Flow  Complexity  for  COBOL 
Programs 

The  information  flow  complexity  of  a  COBOL  program  was  determined  based  on 
the  following  strategies. 

1.  Each paragraph of the procedure division (PD) was considered as a
    separate procedure.

2.  Two different complexity values were derived using two different
    length values as follows.

    a.  Length
        = Number of statements (NOS) of the particular procedure
          under consideration
        = Number of verbs in the paragraph

    b.  Length
        = (NOS of the particular paragraph in PD) + (NOS in the data
          division entries associated with this paragraph)
        = (number of verbs in the PD paragraph) + (number of periods
          in the associated data division entries)

The  Data  Structures  (DS)  retrieved  and  updated  are  counted  by 
assessing  the  number  of  referenced  and  assigned  data  items, 
respectively,  based  on  the  semantics  of  the  various  COBOL 
statements.  Some  of  the  rules  followed  are  listed  below. 


COBOL Statements                              DS retrieved   DS updated

(a)  MOVE identifier-1 TO identifier-2.            1             1
     MOVE constant TO identifier.                  0             1
       e.g., MOVE 1 TO FLAG;
             MOVE 'BILL' TO NAME-IN.

(b)  ADD constant TO identifier                    1             1
       e.g., ADD 2 TO COUNT.
     ADD id-1, id-2 GIVING id-3                    2             1
       e.g., ADD A, B GIVING C.
     ADD id-1 TO id-2                              2             1
       e.g., ADD A TO B.

(c)  IF A = B THEN                                 2             0
     IF A IS NUMERIC THEN                          1             0

(d)  READ filename AT END                          1             1
       MOVE 'YES' TO FLAG.

(e)  WRITE OUT-REC FROM detail-line                1             1
       AFTER ADVANCING 2 LINES.
     WRITE OUT-REC FROM Heading-1.                 2             1
     WRITE OUT-REC FROM Heading-2.
       (both statements taken together)

(f)  SORT Sort-file                                1             1
       USING file-1
       GIVING file-2.
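
The counting rules above can be sketched in Python for a few statement
forms. This is an illustration under a stated assumption: data items are
counted once per paragraph however often they are referenced, which is
inferred from the pair of WRITE statements in rule (e) and is not spelled out
in the text. The helper names (`move`, `add_to`, `write_from`,
`count_data_structures`) are hypothetical.

```python
def move(source, target):
    """MOVE source TO target; pass None for a constant source."""
    retrieved = set() if source is None else {source}
    return retrieved, {target}


def add_to(addend, target):
    """ADD addend TO target; pass None for a constant addend."""
    retrieved = {target} if addend is None else {addend, target}
    return retrieved, {target}


def write_from(record, source):
    """WRITE record FROM source."""
    return {source}, {record}


def count_data_structures(statements):
    """Return (DS retrieved, DS updated) for a list of statements.

    Each statement is a (retrieved_items, updated_items) pair of sets;
    distinct items are counted once across the whole list (an assumption).
    """
    retrieved, updated = set(), set()
    for statement_retrieved, statement_updated in statements:
        retrieved |= statement_retrieved
        updated |= statement_updated
    return len(retrieved), len(updated)


headings = [
    write_from("OUT-REC", "HEADING-1"),
    write_from("OUT-REC", "HEADING-2"),
]
print(count_data_structures(headings))
```

Only three statement forms are covered; the remaining rules would follow
the same pattern.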
1.2  Calculations for Finding Information Flow Complexity for Each Program

The explicit values of fan-in and fan-out corresponding to each module
(paragraph), along with their information flow complexity values, are shown
below. For each module two different values of length are considered; hence
two different complexity values are shown. The first value corresponds to the
length of the procedure division paragraph together with the associated data
division entries. The second value of the length refers to the length of the
procedure division paragraph only. The fan-in and fan-out for each module
consist of two distinct operands. The first operand denotes the number of
local flow(s) to (or from) the module considered. The second operand refers
to the number of data structures from which the module retrieves information
(or which the module updates).
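
The tabulated complexity values can be recomputed from the fan-in, fan-out
and length columns. The sketch below is a minimal illustration, assuming the
formula given in the column headings of the tables that follow, IFC =
Length * (Fan-in * Fan-out)^2, with each fan value passed as a pair (local
flows, data structures). The function name is an assumption made for this
example.

```python
def information_flow_complexity(length, fan_in, fan_out):
    """IFC = length * (fan_in * fan_out) ** 2.

    fan_in and fan_out are (local_flows, data_structures) pairs; the two
    operands of each pair are added before the formula is applied.
    """
    total_in = sum(fan_in)
    total_out = sum(fan_out)
    return length * (total_in * total_out) ** 2


# Modules 12 and 13 of the first program tabulated below: identical
# lengths and fan-out, but module 13 has one more local flow into it.
module_12 = information_flow_complexity(42, fan_in=(1, 8), fan_out=(0, 4))
module_13 = information_flow_complexity(42, fan_in=(2, 8), fan_out=(0, 4))
print(module_12, module_13)
```

The single extra local flow into module 13 accounts for the whole
difference between the two values, since the fan product enters the
formula squared.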

1.3  Calculations  for  Finding  Chunk  Model  Complexity  for  Each  Program 

In order to find the chunk model complexity, the fan-in for each chunk
(performed paragraph) is determined. The fan-in for each chunk is the sum of
the number of control connections and the number of data connections to that
chunk (as described in Section 3.4). The complexity of the entire program can
then be calculated using the formula shown in Section 3.4. In this section,
the fan-in value for each chunk is explicitly shown as the sum of two distinct
operands. The first operand refers to the number of control connections to
the particular chunk, and the second operand denotes the number of data
connections to that chunk. The specific chunks having control or data
connections to each chunk are indicated to the right of each fan-in
computation (e.g., 2 dl means that chunk 2 has a data connection to chunk
1). The detailed calculations for finding the chunk model complexity for each
program are shown below. The fan-in for the ith chunk is denoted by f_i.
MODULE   FAN-IN      FAN-OUT     LENGTH (NOS)    IFC = Length * (Fan-in * Fan-out)^2
                                 D+P     P       D+P          P

  1      1+4=5       1+4=5       41      8       25625        5000
  2      2+10=12     5+22=27     76      24      7978176      2519424
  3      1+1=2       4+2=6       33      11      4752         1584
  4      2+1=3       0+2=2       24      2       864          72
  5      1+3=4       0+4=4       31      6       7936         1536
  6      1+2=3       0+3=3       27      4       2187         324
  7      1+2=3       0+2=2       24      2       864          72
  8      1+2=3       0+4=4       30      6       4320         288
  9      4+4=8       3+4=7       36      9       112896       28224
 10      2+14=16     1+3=4       85      56      348160       229375
 11      1+8=9       0+4=4       42      8       54432        10368
 12      1+8=9       0+4=4       42      8       54432        10368
 13      2+8=10      0+4=4       42      8       67200        12800

                                         Total   8661844      2819435

INFORMATION FLOW COMPLEXITY

MODULE   FAN-IN      FAN-OUT     LENGTH (NOS)    IFC = Length * (Fan-in * Fan-out)^2
                                 D+P     P       D+P          P

  1      1+2=3       1+2=3       29      6       2349         486
  2      1+2=3       5+12=17     38      12      98838        31212
  3      1+15=16     0+11=11     69      19      2137344      588544
  4      1+1=2       4+2=6       30      12      4320         1728
  5      2+2=4       0+4=4       45      8       11520        2048
  6      1+8=9       0+9=9       60      19      393660       124659
  7      1+3=4       0+4=4       43      6       11008        1536
  8      1+3=4       0+4=4       45      8       11520        2048
  9      1+3=4       0+3=3       44      7       6336         1008
 10      1+1=2       3+0=3       7       3       252          108
 11      1+13=14     0+9=9       50      17      793800       269892
 12      1+14=15     0+9=9       86      19      1567350      346275
 13      1+9=10      1+1=2       33      18      13200        7200
 14      1+2=3       0+2=2       12      2       432          72

                                         Total   5051929      1376816

INFORMATION FLOW COMPLEXITY
TC0645

MODULE   FAN-IN      FAN-OUT     LENGTH (NOS)    IFC = Length * (Fan-in * Fan-out)^2
                                 D+P     P       D+P          P

  1      1+4=5       1+4=5       38      8       23750        5000
  2      2+8=10      5+17=22     75      27      3630000      1306800
  3      1+1=2       3+2=5       30      11      3000         1100
  4      1+2=3       0+4=4       41      6       5904         864
  5      1+8=9       0+8=8       65      22      336960       114048
  6      1+3=4       0+4=4       41      6       10496        1536
  7      1+3=4       0+4=4       41      6       10496        1536
  8      1+3=4       0+5=5       55      19      22000        7600
  9      4+5=9       3+4=7       40      13      158760       51597
 10      2+11=13     0+3=3       58      39      88218        59319
 11      1+5=6       0+5=5       42      8       37800        7200
 12      1+5=6       0+5=5       42      8       37800        7200
 13      1+5=6       0+5=5       42      8       37800        7200

                                         Total   4402984      1571000

INFORMATION FLOW COMPLEXITY
TC0645, Lab 3, CIS 313

MODULE   FAN-IN      FAN-OUT     Length (L)      IFC = L*(I*O)^2
         (I)         (O)

  1      4+8=12      7+11=18     122     25      5692032      1166400
  2      3+8=11      5+18=23     75      28      4800675      1792252
  3      1+1=2       3+2=5       30      11      30000        1100
  4      1+2=3       0+4=4       41      6       5904         864
  5      1+8=9       0+8=8       65      22      336960       114048
  6      1+3=4       0+4=4       41      6       10496        1536
  7      1+3=4       0+4=4       41      6       10496        1536
  8      1+3=4       0+5=5       55      19      22000        7600
  9      6+9=15      3+10=13     68      19      2585700      722475
 10      1+11=12     0+3=3       55      36      71280        46656
 11      1+5=6       0+5=5       42      8       37800        7200
 12      1+5=6       0+5=5       42      8       37800        7200
 13      1+5=6       0+5=5       42      8       37800        7200
 14      2+5=7       0+5=5       42      8       51450        9800
 15      3+2=5       0+2=2       17      2       1700         200
 16      1+2=3       0+2=2       20      2       720          72
 17      6+2=8       0+1=1       51      4       3264         256
 18      5+2=7       4+0=4       35      4       27440        3136
 19      1+3=4       1+2=3       41      6       5904         864
 20      4+7=11      3+9=12      83      15      1446192      261360
 21      3+9=12      0+14=14     80      24      2257920      677376
 22      1+1=2       6+0=6       32      17      4608         2448
 23      2+10=12     1+15=16     110     20      4055040      737280
 24      2+5=7       2+6=8       73      16      228928       50176
 25      2+7=9       2+7=9       85      14      557685       91854
 26      2+4=6       2+5=7       83      12      146412       21168
 27      2+5=7       2+6=8       73      16      228928       50176
 28      2+4=6       2+6=8       74      16      170496       36864
 29      6+5=11      0+4=4       62      5       120032       9680
 30      6+6=12      0+14=14     68      7       1919232      197568
 31      5+6=11      1+8=9       80      9       784080       88209
 32      1+4=5       0+5=5       33      7       20625        4375

                                         Total   25709599     6118929
INFORMATION FLOW COMPLEXITY
TC0671, Lab 3, CIS 313

MODULE  FAN-IN     FAN-OUT    Length (L)   IFC = L*(I*O)^2
        (I)        (O)        D+P    P     D+P        P
1       1+1=2      3+3=6      150    13    21600      1872
2       2+4=6      1+4=5      27     5     24300      4500
3       3+2=5      4+0=4      22     4     8800       1600
4       2+2=4      0+2=2      21     2     1344       128
5       2+2=4      0+2=2      20     2     1280       128
6       4+2=6      0+1=1      56     4     2016       144
7       4+2=6      5+0=5      38     5     34200      4500
8       1+3=4      1+2=3      47     6     6768       864
9       2+2=4      3+3=6      42     8     24192      4608
10      1+7=8      0+4=4      37     8     37888      8192
11      2+12=14    0+15=15    70     14    3087000    617400
12      1+1=2      6+0=6      35     17    5040       2448
13      4+5=9      2+7=9      58     12    380538     78732
14      6+10=16    0+14=14    73     14    3662848    702464
15      1+0=1      0+1=1      19     1     19         1
16      2+11=13    (illeg.)   85     14    517140     85176
17      2+4=6      (illeg.)   65     14    37440      8064
18      2+5=7      (illeg.)   82     12    100450     14700
19      2+5=7      (illeg.)   80     10    98000      12250
20      2+4=6      (illeg.)   67     14    38592      8064
21      (illeg.)   2+3=5      65     13    14625      2925
22      7+11=18    1+9=10     45     16    1458000    518400
23      3+14=17    1+14=15    57     16    3706425    1040400
24      1+5=6      0+4=4      41     7     23616      4032
25      2+11=13    5+22=27    73     24    8993673    2956824
26      1+1=2      4+2=6      33     11    4752       1584
27      2+1=3      0+2=2      24     2     864        72
28      1+3=4      0+4=4      31     6     7936       1536
29      1+2=3      0+3=3      27     4     2187       324
30      1+2=3      0+2=2      24     2     864        72
31      1+2=3      0+4=4      30     6     4320       864
32      7+5=12     3+5=8      53     10    488448     92160
33      2+15=17    1+3=4      85     56    393040     258944
34      1+7=8      0+4=4      42     8     43008      8192
35      1+7=8      0+4=4      42     8     43008      8192
36      3+7=10     0+4=4      43     8     68800      12800

Total                                      23343021   5913156
INFORMATION FLOW COMPLEXITY


TC0622, Lab 3, CIS 313

MODULE  FAN-IN     FAN-OUT    Length (L)   IFC = L*(I*O)^2
        (I)        (O)        D+P    P     D+P        P
1       4+8=12     6+12=18    124    30    5785344    1399680
2       1+1=2      0+2=2      12     2     192        32
3       1+5=6      1+3=4      34     10    19584      5760
4       2+11=13    5+18=23    63     29    5632263    2592629
5       1+7=8      4+2=6      31     12    71424      27648
6       1+2=3      0+2=2      16     3     576        108
7       1+2=3      0+2=2      16     3     576        108
8       1+2=3      0+2=2      17     4     612        144
9       2+1=3      0+1=1      12     2     108        18
10      1+3=4      0+3=3      21     6     3024       864
11      4+15=19    3+13=16    82     29    7578112    2680064
12      1+5=6      0+5=5      35     11    31500      9900
13      2+7=9      0+5=5      37     12    74925      24300
14      2+11=13    0+3=3      59     39    89739      59319
15      2+8=10     0+6=6      31     9     111600     32400
16      2+9=11     0+7=7      42     10    249018     59290
17      4+2=6      0+1=1      45     2     1620       72
18      5+2=7      4+0=4      31     4     24304      3136
19      2+8=10     1+7=8      59     11    377600     70400
20      4+5=9      4+7=11     64     13    627264     127413
21      3+7=10     2+9=11     85     16    1028500    193600
22      8+6=14     0+5=5      39     11    191100     53900
23      1+1=2      6+0=6      18     6     2592       864
24      2+11=13    0+12=12    37     12    900432     292032
25      2+8=10     4+14=18    108    25    3499200    810000
26      2+6=8      4+6=10     80     28    512000     179200
27      2+4=6      5+4=9      54     17    157464     49572
28      2+6=8      4+7=11     83     18    642752     139392
29      2+6=8      4+6=10     80     28    512000     179200
30      2+5=7      4+5=9      80     28    317520     111132
31      6+8=14     0+18=18    75     18    4762800    1143072
32      1+3=4      0+3=3      31     3     4464       432
33      1+15=16    1+14=15    60     17    3456000    979200
34      1+5=6      0+5=5      36     11    32400      9900
35      7+3=10     1+3=4      29     5     46400      8000
36      7+3=10     0+2=2      15     3     6000       1200

Total                                      36751009   11243981


Program: CIS 313 - L2 - TC0650

f1  = 0+1 = 1    2 → d1
f2  = 1+1 = 2    1 → c2, 13 → d2
f3  = 1+0 = 1    2 → c3
f4  = 1+1 = 2    2 → c4, 13 → d4
f5  = 2+1 = 3    2 → c5, 4 → c5
f6  = 1+2 = 3    2 → c6, 10 → d6
f7  = 1+1 = 2    4 → c7, 13 → d7
f8  = 1+1 = 2    4 → c8, 13 → d8
f9  = 1+1 = 2    4 → c9, 13 → d9
f10 = 1+0 = 1    2 → c10
f11 = 1+0 = 1    10 → c11
f12 = 1+1 = 2    10 → c12, 14 → d12
f13 = 1+1 = 2    10 → c13, 14 → d13
f14 = 1+0 = 1    13 → c14

$$C = (c_1+c_3+c_{10}+c_{11}+c_{14}) \sum_{m=0}^{1} R^m + (c_2+c_4+c_7+c_8+c_9+c_{12}+c_{13}) \sum_{m=0}^{2} R^m + (c_5+c_6) \sum_{m=0}^{3} R^m$$

$$C_{P+D} = (29+69+7+50+12)(1.666) + (38+30+43+45+44+86+33)(2.11) + (45+60)(2.406) = 1204$$

$$C_P = (6+19+3+17+2)(1.666) + (12+12+6+8+7+19+18)(2.11) + (8+19)(2.406) = 316$$

Note: $C_{P+D}$ is the complexity of the program when the length of each module includes the PD paragraph together with its associated DD entries. $C_P$ is the complexity of the program when the length of each module refers to the PD paragraph only.
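
The two totals for TC0650 can be reproduced by grouping module lengths by their f value and multiplying each group by the printed coefficient. This is a minimal sketch under that reading; the function name and the dictionary layout are illustrative assumptions, and the coefficients are copied from the calculation rather than derived.

```python
# Coefficients as printed in the calculation: sum of R**m for m = 0..f.
PRINTED_SUM = {1: 1.666, 2: 2.11, 3: 2.406}


def weighted_length_complexity(lengths_by_f):
    """Sum (printed coefficient for f) * (total length of modules with that f)."""
    return sum(PRINTED_SUM[f] * sum(lengths) for f, lengths in lengths_by_f.items())


# Lengths counted over procedure paragraph plus data division entries (P+D).
c_pd = weighted_length_complexity({
    1: [29, 69, 7, 50, 12],
    2: [38, 30, 43, 45, 44, 86, 33],
    3: [45, 60],
})

# Lengths counted over the procedure paragraph only (P).
c_p = weighted_length_complexity({
    1: [6, 19, 3, 17, 2],
    2: [12, 12, 6, 8, 7, 19, 18],
    3: [8, 19],
})

print(round(c_pd), round(c_p))  # the report gives 1204 and 316
```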


Program: CIS 313 - L2 - TC0671

f1  = 0+1 = 1    2 → d1
f2  = 1+2 = 3    1 → c2, 9 → d2, 10 → d2
f3  = 1+1 = 2    2 → c3, 10 → d3
f4  = 2+1 = 3    2 → c4, 3 → c4, 10 → d4
f5  = 1+2 = 3    2 → c5, 9 → d5, 10 → d5
f6  = 1+1 = 2    3 → c6, 10 → d6
f7  = 1+1 = 2    3 → c7, 10 → d7
f8  = 1+1 = 2    3 → c8, 10 → d8
f9  = 1+1 = 2    2 → c9, 10 → d9
f10 = 1+0 = 1    9 → c10
f11 = 1+0 = 1    2 → c11 and 2 → d11
f12 = 1+0 = 1    9 → c12 and 9 → d12
f13 = 2+0 = 2    9 → c13, d13; 10 → c13

$$C = (c_1+c_{10}+c_{11}+c_{12}) \sum_{m=0}^{1} R^m + (c_3+c_6+c_7+c_8+c_9+c_{13}) \sum_{m=0}^{2} R^m + (c_2+c_4+c_5) \sum_{m=0}^{3} R^m$$

$$C_{P+D} = (41+85+42+42)(1.666) + (33+27+24+30+36+42)(2.11) + (76+24+31)(2.406) = 1070$$

$$C_P = (8+56+8+8)(1.666) + (11+4+2+6+0+8)(2.11) + (24+2+6)(2.406)$$
Program: CIS 313 - L2 - TC0645


f1  = 0+1 = 1    2 → d1
f2  = 1+2 = 3    1 → c2, 9 → d2, 10 → d2
f3  = 1+1 = 2    2 → c3, 10 → d3
f4  = 1+1 = 2    2 → c4, 10 → d4
f5  = 1+1 = 2    2 → c5, 10 → d5
f6  = 1+1 = 2    3 → c6, 10 → d6
f7  = 1+1 = 2    3 → c7, 10 → d7
f8  = 1+1 = 2    3 → c8, 10 → d8
f9  = 1+1 = 2    2 → c9, 10 → d9
f10 = 1+0 = 1    9 → c10
f11 = 1+0 = 1    2 → c11, d11
f12 = 1+0 = 1    9 → c12, d12
f13 = 1+1 = 2    9 → c13, 10 → d13

$$C = (c_1+c_{10}+c_{11}+c_{12}) \sum_{m=0}^{1} R^m + (c_3+c_4+c_5+c_6+c_7+c_8+c_9+c_{13}) \sum_{m=0}^{2} R^m + (c_2) \sum_{m=0}^{3} R^m$$

$$C_{P+D} = (38+58+42+42)(1.666) + (30+41+65+41+41+55+40+42)(2.11) + (75 \times 2.406) = 1229$$

$$C_P = (8+39+8+8)(1.666) + (11+6+22+6+6+19+13+8)(2.11) + (27 \times 2.406) = 362$$
Program: CIS 313 - L3 - TC0645


f1  = 0+17 = 17     f17 = 2+5  = 7
f2  = 1+4  = 5      f18 = 1+0  = 1
f3  = 1+2  = 3      f19 = 1+9  = 10
f4  = 1+2  = 3      f20 = 1+10 = 11
f5  = 1+2  = 3      f21 = 1+1  = 2
f6  = 1+2  = 3      f22 = 1+0  = 1
f7  = 1+2  = 3      f23 = 1+8  = 9
f8  = 1+2  = 3      f24 = 1+7  = 8
f9  = 1+3  = 4      f25 = 1+6  = 7
f10 = 1+2  = 3      f26 = 1+5  = 6
f11 = 1+1  = 2      f27 = 1+4  = 5
f12 = 1+1  = 2      f28 = 1+2  = 3
f13 = 1+4  = 5      f29 = 5+9  = 14
f14 = 2+11 = 13     f30 = 6+9  = 15
f15 = 3+4  = 7      f31 = 1+0  = 1
f16 = 1+2  = 3      f32 = 1+0  = 1


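Using the notation

$$R_n = \sum_{m=0}^{n} R^m$$

the complexity is

$$\begin{aligned}
C ={}& (c_{18}+c_{22}+c_{31}+c_{32}) R_1 + (c_{11}+c_{12}+c_{21}) R_2 + (c_3+c_4+c_5+c_6+c_7+c_8+c_{10}+c_{16}+c_{28}) R_3 \\
&+ c_9 R_4 + (c_2+c_{13}+c_{27}) R_5 + c_{26} R_6 + (c_{15}+c_{17}+c_{25}) R_7 + c_{24} R_8 + c_{23} R_9 \\
&+ c_{19} R_{10} + c_{20} R_{11} + c_{14} R_{13} + c_{29} R_{14} + c_{30} R_{15} + c_1 R_{17}
\end{aligned}$$

$$\begin{aligned}
C_{P+D} ={}& (35+32+80+33)(1.666) + (42+42+80)(2.11) + (30+41+65+41+41+55+55+20+74)(2.406) \\
&+ (68 \times 2.603) + (75+42+73)(2.734) + (83 \times 2.822) + (17+51+85)(2.88) + (73 \times 2.919) \\
&+ (110 \times 2.945) + (41 \times 2.962) + (83 \times 2.973) + (42 \times 2.978) + (62 \times 2.981) \\
&+ (68 \times 2.983) + (122 \times 2.984) \\
={}& 4814
\end{aligned}$$

$$\begin{aligned}
C_P ={}& (4+17+9+7)(1.666) + (8+8+24)(2.11) + (11+6+22+6+6+19+36+2+16)(2.406) \\
&+ (19 \times 2.603) + (28+8+16)(2.734) + (12 \times 2.822) + (2+4+14)(2.88) + (16 \times 2.919) \\
&+ (20 \times 2.945) + (6 \times 2.962) + (15 \times 2.973) + (8 \times 2.978) + (5 \times 2.981) \\
&+ (7 \times 2.983) + (25 \times 2.984) \\
={}& 1031
\end{aligned}$$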
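
The coefficients multiplying each group of lengths are partial sums of a geometric series. The printed values (1.666, 2.11, 2.406, ...) are close to what R = 2/3 would give; that value of R is an inference from the numbers, not something stated in this section. The sketch below compares the two under that assumption, with illustrative names.

```python
R = 2 / 3  # assumed ratio; inferred from the printed coefficients, not stated here


def geometric_partial_sum(r, n):
    """Return sum of r**m for m = 0..n."""
    return sum(r ** m for m in range(n + 1))


# Coefficients as they appear in the calculations of this appendix.
printed = {1: 1.666, 2: 2.11, 3: 2.406, 4: 2.603, 5: 2.734, 6: 2.822,
           7: 2.88, 8: 2.919, 9: 2.945, 10: 2.962, 11: 2.973,
           13: 2.978, 14: 2.981, 15: 2.983, 17: 2.984}

for n, value in printed.items():
    computed = geometric_partial_sum(R, n)
    print(f"R_{n}: printed {value}, computed {computed:.3f}")
```

The series converges to 1/(1 - R), which is 3 under this assumption, so modules with large f contribute at most about three times their length. The printed values for larger n sit slightly below the computed ones, which is consistent with truncated intermediate arithmetic.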


Program: CIS 313 - L3 - TC0671

f1  = 0+2  = 2      f19 = 1+2 = 3
f2  = 1+7  = 8      f20 = 1+3 = 4
f3  = 1+0  = 1      f21 = 1+1 = 2
f4  = 2+5  = 7      f22 = 6+0 = 6
f5  = 2+2  = 4      f23 = 1+0 = 1
f6  = 2+4  = 6      f24 = 1+0 = 1
f7  = 1+0  = 1      f25 = 1+4 = 5
f8  = 1+8  = 9      f26 = 1+2 = 3
f9  = 1+1  = 2      f27 = 2+2 = 4
f10 = 1+3  = 4      f28 = 1+2 = 3
f11 = 2+2  = 4      f29 = 1+2 = 3
f12 = 1+0  = 1      f30 = 1+2 = 3
f13 = 1+0  = 1      f31 = 1+2 = 3
f14 = 6+2  = 8      f32 = 1+3 = 4
f15 = 1+0  = 1      f33 = 1+0 = 1
f16 = 1+10 = 11     f34 = 1+1 = 2
f17 = 1+4  = 5      f35 = 1+1 = 2
f18 = 1+2  = 3      f36 = 3+3 = 6

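$$\begin{aligned}
C ={}& (c_3+c_7+c_{12}+c_{13}+c_{15}+c_{23}+c_{24}+c_{33}) R_1 + (c_1+c_9+c_{21}+c_{34}+c_{35}) R_2 \\
&+ (c_{18}+c_{19}+c_{26}+c_{28}+c_{29}+c_{30}+c_{31}) R_3 + (c_5+c_{10}+c_{11}+c_{20}+c_{27}+c_{32}) R_4 \\
&+ (c_{17}+c_{25}) R_5 + (c_6+c_{22}+c_{36}) R_6 + c_4 R_7 + (c_2+c_{14}) R_8 + c_8 R_9 + c_{16} R_{11}
\end{aligned}$$

$$\begin{aligned}
C_{P+D} ={}& (22+38+35+58+19+57+41+85)(1.666) + (150+42+65+42+42)(2.11) \\
&+ (82+80+33+31+27+24+30)(2.406) + (20+37+70+67+24+53)(2.603) \\
&+ (65+73)(2.734) + (56+45+43)(2.822) + (21 \times 2.88) + (27+73)(2.919) \\
&+ (47 \times 2.945) + (85 \times 2.973) \\
={}& 4280
\end{aligned}$$

$$\begin{aligned}
C_P ={}& (4+5+17+12+1+16+7+56)(1.666) + (13+8+13+8+8)(2.11) + (12+10+11+6+4+2+6)(2.406) \\
&+ (2+8+14+14+2+10)(2.603) + (14+24)(2.734) + (4+16+8)(2.822) + (2 \times 2.88) \\
&+ (5+14)(2.919) + (6 \times 2.945) + (14 \times 2.973)
\end{aligned}$$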


Program: CIS 313 - L3 - TC0622

f1  = 0+6 = 6       f19 = 1+10 = 11
f2  = 1+1 = 2       f20 = 1+8  = 9
f3  = 1+2 = 3       f21 = 1+8  = 9
f4  = 1+6 = 7       f22 = 8+8  = 16
f5  = 1+2 = 3       f23 = 1+0  = 1
f6  = 1+2 = 3       f24 = 2+0  = 2
f7  = 1+2 = 3       f25 = 1+11 = 12
f8  = 1+2 = 3       f26 = 1+8  = 9
f9  = 2+1 = 3       f27 = 1+8  = 9
f10 = 1+2 = 3       f28 = 1+9  = 10
f11 = 1+4 = 5       f29 = 1+8  = 9
f12 = 1+1 = 2       f30 = 1+8  = 9
f13 = 2+3 = 5       f31 = 6+0  = 6
f14 = 1+3 = 4       f32 = 1+1  = 2
f15 = 2+7 = 9       f33 = 1+0  = 1
f16 = 2+2 = 4       f34 = 1+0  = 1
f17 = 2+3 = 5       f35 = 6+1  = 7
f18 = 1+0 = 1       f36 = 6+1  = 7

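$$\begin{aligned}
C ={}& (c_{18}+c_{23}+c_{33}+c_{34}) R_1 + (c_2+c_{12}+c_{24}+c_{32}) R_2 + (c_3+c_5+c_6+c_7+c_8+c_9+c_{10}) R_3 \\
&+ (c_{14}+c_{16}) R_4 + (c_{11}+c_{13}+c_{17}) R_5 + (c_1+c_{31}) R_6 + (c_4+c_{35}+c_{36}) R_7 \\
&+ (c_{15}+c_{20}+c_{21}+c_{26}+c_{27}+c_{29}+c_{30}) R_9 + c_{28} R_{10} + c_{19} R_{11} + c_{25} R_{12} + c_{22} R_{16}
\end{aligned}$$

$$\begin{aligned}
C_{P+D} ={}& (31+18+60+36)(1.666) + (12+35+37+31)(2.11) + (34+31+16+16+17+12+21)(2.406) \\
&+ (59+42)(2.603) + (82+37+45)(2.734) + (124+75)(2.822) + (63+29+15)(2.88) \\
&+ (31+64+85+80+54+80+80)(2.945) + (83 \times 2.962) + (59 \times 2.973) + (108 \times 2.98) \\
&+ (39 \times 2.992) \\
={}& 4678
\end{aligned}$$

$$\begin{aligned}
C_P ={}& (4+6+17+11)(1.666) + (2+11+12+3)(2.11) + (10+12+3+3+4+2+6)(2.406) \\
&+ (39+10)(2.603) + (29+12+2)(2.734) + (30+18)(2.822) + (29+5+3)(2.88) \\
&+ (9+13+16+28+17+28+28)(2.945) + (18 \times 2.962) + (11 \times 2.973) + (25 \times 2.98) + (11 \times 2.992)
\end{aligned}$$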


The explicit expressions for $E_{total}$, calculated using all three strategies described in Chapter 4, are presented in this section. The values of the unit effort for each module in a program are also shown. To determine the unit effort for each module (procedure division paragraph) together with its data division entries, each module was run through the Software Science Analyzer developed at Ohio State University [2].


By Equation (4.4), strategy 2 yields

$$E_{total} = \sum_{i=1}^{14} U(i) + \tfrac{1}{2}\left[U(2) + U(14) + U(5,7,8,9) + U(11,12,13) + U(3,4,5,6,10)\right] = 806751$$

$$T_2 = 806751/(18 \times 3600) \approx 12.4 \text{ hrs.}$$


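Referring to the actual code and Equation (4.3), strategy 3 gives

$$\begin{aligned}
E_{total} ={}& \sum_{i=1}^{14} U(i) + \left[\tfrac{1}{3} U(14) + \tfrac{1}{10} U(5,7,8,9) + \tfrac{1}{25} U(11,12,13,14) \right. \\
&\left. + \tfrac{12}{27} U(3,4,\ldots,14) + \tfrac{4}{28} U(2,3,\ldots,14)\right] \\
={}& 977778
\end{aligned}$$

$$T_3 = \frac{977778}{18 \times 3600} = 15.08 \text{ hrs.}$$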
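
Every time estimate in this section divides the total effort by 18 × 3600. The divisor 18 is the rate of elementary mental discriminations per second used in software science (commonly called the Stroud number), and 3600 converts seconds to hours. The following is a minimal sketch of that conversion; the function and constant names are illustrative.

```python
STROUD_NUMBER = 18       # discriminations per second, the divisor used in the text
SECONDS_PER_HOUR = 3600


def effort_to_hours(effort, stroud=STROUD_NUMBER):
    """Convert a software science effort value E into estimated hours."""
    return effort / (stroud * SECONDS_PER_HOUR)


t2 = effort_to_hours(806751)  # strategy 2 total effort
t3 = effort_to_hours(977778)  # strategy 3 total effort
print(f"T2 = {t2:.2f} hrs, T3 = {t3:.2f} hrs")
```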


2.  Program:  CIS  313-L2-TC0671 


The  structure  chart  for  this  program  is  shown  in  Section  3.7. 


Strategy 1

$$\begin{aligned}
E(\mathrm{Sub}\ 10) &= E(\mathrm{Sub}(10,13)) = U(10) + U(13) + K\,U(13) \\
E(\mathrm{Sub}\ 9) &= E(\mathrm{Sub}(9,10,12,13)) = U(9) + U(12) + U(13) + E(\mathrm{Sub}\ 10) + K[U(10,12,13)] \\
E(\mathrm{Sub}\ 3) &= E(\mathrm{Sub}(3,4,6,7,8)) = U(3) + U(6) + U(7) + U(8) + U(4) + K[U(4,6,7,8)] \\
E(\mathrm{Sub}\ 2) &= U(2) + E(\mathrm{Sub}\ 3) + U(4) + U(5) + E(\mathrm{Sub}\ 9) + U(11) + K[U(3,4,\ldots,13)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 2) + K[U(2,3,\ldots,13)] \\
&= \sum_{i=1}^{13} U(i) + K[U(13) + U(10,12,13) + U(4,6,7,8) + U(3,4,\ldots,13) + U(2,3,\ldots,13)] \qquad (4)
\end{aligned}$$


Strategy 2

$$\begin{aligned}
E(\mathrm{Sub}\ 10) &= E(\mathrm{Sub}(10,13)) = U(10) + U(13) + K\,U(13) \\
E(\mathrm{Sub}\ 9) &= E(\mathrm{Sub}(9,10,12,13)) = U(9) + E(\mathrm{Sub}\ 10) + U(12) + U(13) + K[U(10,12,13)] \\
E(\mathrm{Sub}\ 3) &= U(3) + U(4) + U(6) + U(7) + U(8) + K[U(4,6,7,8)] \\
E(\mathrm{Sub}\ 2) &= U(2) + U(11) + E(\mathrm{Sub}\ 3) + U(4) + U(5) + E(\mathrm{Sub}\ 9) + K[U(3,4,5,9,11)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 2) + K\,U(2) \\
&= \sum_{i=1}^{13} U(i) + K[U(2) + U(13) + U(4,6,7,8) + U(10,12,13) + U(3,4,5,9,11)] \qquad (5)
\end{aligned}$$


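Using Equation (5),

$$\begin{aligned}
E_{total} &= \sum_{i=1}^{13} U(i) + \tfrac{1}{2}\left[U(2) + U(13) + U(10,12,13) + U(4,6,7,8) + U(3,4,5,9,11)\right] \\
&= 437857 + \tfrac{1}{2} \times 579458 \\
&= 727586
\end{aligned}$$

$$T_2 = 727586/(18 \times 3600) = 11.2 \text{ hrs.}$$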


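Referring to the actual code, Equation (4) along with strategy 3 gives

$$\begin{aligned}
E_{total} ={}& \sum_{i=1}^{13} U(i) + \left[\tfrac{2}{8} U(13) + \tfrac{4}{24} U(10,12,13) + \tfrac{2}{7} U(4,6,7,8) \right. \\
&\left. + \tfrac{7}{32} U(2,3,\ldots,13) + \tfrac{15}{31} U(3,4,\ldots,13)\right] \\
={}& 1033443
\end{aligned}$$

$$T_3 = 1033443/(18 \times 3600) = 15.9 \text{ hrs.}$$
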
3. Program: CIS 313-L2-TC0645

The  structure  chart  for  this  program  was  shown  in  Section  3.7. 


Strategy 1

$$\begin{aligned}
E(\mathrm{Sub}\ 9) &= E(\mathrm{Sub}(9,10,12,13)) = U(9) + U(10) + U(12) + U(13) + K[U(10,12,13)] \\
E(\mathrm{Sub}\ 3) &= E(\mathrm{Sub}(3,6,7,8)) = U(3) + U(6) + U(7) + U(8) + K[U(6,7,8)] \\
E(\mathrm{Sub}\ 2) &= U(2) + U(11) + E(\mathrm{Sub}\ 3) + U(4) + U(5) + E(\mathrm{Sub}\ 9) + K[U(3,4,\ldots,13)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 2) + K[U(2,3,\ldots,13)] \\
&= \sum_{i=1}^{13} U(i) + K[U(6,7,8) + U(10,12,13) + U(3,4,\ldots,13) + U(2,3,\ldots,13)] \qquad (7)
\end{aligned}$$

Strategy 2

$$\begin{aligned}
E(\mathrm{Sub}\ 9) &= U(9) + U(10) + U(12) + U(13) + K[U(10,12,13)] \\
E(\mathrm{Sub}\ 3) &= U(3) + U(6) + U(7) + U(8) + K[U(6,7,8)] \\
E(\mathrm{Sub}\ 2) &= U(2) + U(11) + E(\mathrm{Sub}\ 3) + U(4) + U(5) + E(\mathrm{Sub}\ 9) + K[U(3,4,5,9,11)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 2) + K\,U(2) \\
&= \sum_{i=1}^{13} U(i) + K[U(2) + U(6,7,8) + U(10,12,13) + U(3,4,5,9,11)] \qquad (8)
\end{aligned}$$


The unit efforts for each module are listed below.

Module Number   Effort, E      Union of the Modules   Effort, E
1               31325          U(6,7,8)               71185
2               102441         U(10,12,13)            135153
3               17509          U(3,4,...,13)          653571
4               24823          U(2,3,...,13)          999369
5               49232          U(3,4,5,9,11)          274528
6               24962
7               25204
8               42094
9               29456
10              56099
11              26742
12              26742
13              26742

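Equation (7) gives

$$\begin{aligned}
E_{total} &= \sum_{i=1}^{13} U(i) + \tfrac{1}{2}\left[U(6,7,8) + U(10,12,13) + U(3,4,\ldots,13) + U(2,3,\ldots,13)\right] \\
&= 483371 + \tfrac{1}{2} \times 1859278 = 1413010
\end{aligned}$$

$$T_1 = 1413010/(18 \times 3600) = 21.8 \text{ hrs.}$$

Equation (8) gives

$$\begin{aligned}
E_{total} &= \sum_{i=1}^{13} U(i) + \tfrac{1}{2}\left[U(2) + U(6,7,8) + U(10,12,13) + U(3,4,5,9,11)\right] \\
&= 483371 + \tfrac{1}{2} \times 583307 = 775025
\end{aligned}$$

$$T_2 = 775025/(18 \times 3600) = 12 \text{ hrs.}$$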
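
The strategy 2 derivation for this program is a recursion over the call structure: the effort of a subtree is the unit effort of its root, plus the efforts of its children's subtrees, plus K times the effort of the union of its immediate children. The sketch below applies that recursion to the tabulated values. The `children` map is an assumption read off the derivation (the structure chart itself is in Section 3.7), and all names are illustrative.

```python
K = 0.5  # integration constant used in the numerical evaluation

# Unit efforts U(i) and union efforts from the tables for TC0645, Lab 2.
unit = {1: 31325, 2: 102441, 3: 17509, 4: 24823, 5: 49232, 6: 24962,
        7: 25204, 8: 42094, 9: 29456, 10: 56099, 11: 26742, 12: 26742,
        13: 26742}
union = {(6, 7, 8): 71185, (10, 12, 13): 135153, (3, 4, 5, 9, 11): 274528}

# Assumed call structure: each module maps to the modules it invokes directly.
children = {1: [2], 2: [3, 4, 5, 9, 11], 3: [6, 7, 8], 9: [10, 12, 13]}


def union_effort(modules):
    """Effort of a group of modules analysed as one unit (measured, not computed)."""
    key = tuple(sorted(modules))
    return unit[key[0]] if len(key) == 1 else union[key]


def subtree_effort(module):
    """Strategy 2: root + child subtrees + K * effort of the union of the children."""
    kids = children.get(module, [])
    if not kids:
        return unit[module]
    return (unit[module]
            + sum(subtree_effort(k) for k in kids)
            + K * union_effort(kids))


e_total = subtree_effort(1)
print(e_total, e_total / (18 * 3600))
```

Under these assumptions the recursion gives 775024.5, which matches the value obtained from Equation (8) after rounding.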


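Referring to the actual code, Equation (7) together with strategy 3 gives

$$\begin{aligned}
E_{total} ={}& \sum_{i=1}^{13} U(i) + \left[\tfrac{2}{6} U(6,7,8) + \tfrac{4}{22} U(10,12,13) + \tfrac{15}{28} U(3,4,\ldots,13) + \tfrac{7}{29} U(2,3,\ldots,13)\right] \\
={}& 1121026
\end{aligned}$$

$$T_3 = 1121026/(18 \times 3600) = 17.3 \text{ hrs.}$$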


4.  Program:  CIS  313-L3-TC0671 


The  structure  chart  for  the  program  is  shown  in  the  following  diagram 
The  major  steps  of  the  calculations  are  given  below. 


Strategy 1

$$\begin{aligned}
E(\mathrm{Sub}\ 33) &= U(33) + U(36) + K\,U(36) \\
E(\mathrm{Sub}\ 32) &= U(32) + E(\mathrm{Sub}\ 33) + U(35) + U(36) + K[U(33,35,36)] \\
E(\mathrm{Sub}\ 26) &= U(26) + U(27) + U(29) + U(30) + U(31) + K[U(27,29,30,31)] \\
E(\mathrm{Sub}\ 25) &= U(25) + E(\mathrm{Sub}\ 26) + U(27) + U(28) + E(\mathrm{Sub}\ 32) + U(34) + K[U(26,27,\ldots,36)] \\
E(\mathrm{Sub}\ 2) &= U(2) + E(\mathrm{Sub}\ 25) + K[U(25,26,\ldots,36)] \\
E(\mathrm{Sub}\ 23) &= U(23) + U(24) + K\,U(24) \\
E(\mathrm{Sub}\ 8) &= U(5) + U(8) + K\,U(5) \\
E(\mathrm{Sub}\ 13) &= U(11) + U(13) + U(15) + K[U(11,15)] \\
E(\mathrm{Sub}\ 22) &= U(22) + U(36) + K\,U(36) \\
E(\mathrm{Sub}\ n) &= U(n) + U(14) + E(\mathrm{Sub}\ 22) + K[U(14,22,36)], \quad n = 16, 17, \ldots, 21 \\
E(\mathrm{Sub}\ 12) &= U(12) + \sum_{n=16}^{21} E(\mathrm{Sub}\ n) + K[U(14,16,\ldots,22,36)] \\
E(\mathrm{Sub}\ 9) &= U(9) + U(11) + E(\mathrm{Sub}\ 12) + U(4) + K[U(4,11,\mathrm{Sub}\ 12)] \\
&= U(4) + U(9) + U(11) + E(\mathrm{Sub}\ 12) + K[U(4,11,12,14,16,\ldots,22,36)] \\
E(\mathrm{Sub}\ 7) &= U(7) + E(\mathrm{Sub}\ 8) + E(\mathrm{Sub}\ 9) + U(10) + E(\mathrm{Sub}\ 13) + U(6) + K[U(4,5,6,8,\ldots,22,36)] \\
E(\mathrm{Sub}\ 3) &= U(3) + U(4) + U(5) + U(6) + E(\mathrm{Sub}\ 7) + K[U(4,5,\ldots,22,36)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 2) + E(\mathrm{Sub}\ 3) + E(\mathrm{Sub}\ 23) + K[U(2,3,\ldots,36)]
\end{aligned}$$

$$\begin{aligned}
\therefore\ E_{total} ={}& \sum_{i=1}^{36} U(i) + K[U(24) + 2U(36) + U(11,15) + U(26,\ldots,36) \\
&+ U(25,\ldots,36) + U(33,35,36) + 6U(14,22,36) \\
&+ U(27,29,30,31) + U(4,11,12,14,16,\ldots,22,36) \\
&+ U(4,5,6,8,\ldots,22,36) + U(4,5,\ldots,22,36) \\
&+ U(2,3,\ldots,36)] \qquad (10)
\end{aligned}$$



Strategy 2

$$\begin{aligned}
E(\mathrm{Sub}\ 33) &= U(33) + U(36) + K\,U(36) \\
E(\mathrm{Sub}\ 32) &= U(32) + U(35) + U(36) + E(\mathrm{Sub}\ 33) + K[U(33,35,36)] \\
E(\mathrm{Sub}\ 26) &= U(26) + U(27) + U(29) + U(30) + U(31) + K[U(27,29,30,31)] \\
E(\mathrm{Sub}\ 25) &= U(25) + U(34) + E(\mathrm{Sub}\ 26) + U(27) + U(28) + E(\mathrm{Sub}\ 32) + K[U(26,27,28,32,34)] \\
E(\mathrm{Sub}\ 2) &= U(2) + E(\mathrm{Sub}\ 25) + K\,U(25) \\
E(\mathrm{Sub}\ 23) &= U(23) + U(24) + K\,U(24) \\
E(\mathrm{Sub}\ 8) &= U(5) + U(8) + K\,U(5) \\
E(\mathrm{Sub}\ 13) &= U(13) + U(15) + U(11) + K[U(11,15)] \\
E(\mathrm{Sub}\ 22) &= U(22) + U(36) + K\,U(36) \\
E(\mathrm{Sub}\ n) &= U(n) + U(14) + E(\mathrm{Sub}\ 22) + K[U(14,22)], \quad n = 16, 17, \ldots, 21 \\
E(\mathrm{Sub}\ 12) &= U(12) + \sum_{n=16}^{21} E(\mathrm{Sub}\ n) + K[U(16,\ldots,21)] \\
E(\mathrm{Sub}\ 9) &= U(9) + U(11) + E(\mathrm{Sub}\ 12) + U(4) + K[U(4,11,12)] \\
E(\mathrm{Sub}\ 7) &= U(7) + E(\mathrm{Sub}\ 8) + U(10) + E(\mathrm{Sub}\ 9) + E(\mathrm{Sub}\ 13) + U(6) + K[U(6,8,9,10,13)] \\
E(\mathrm{Sub}\ 3) &= U(3) + U(4) + U(5) + U(6) + E(\mathrm{Sub}\ 7) + K[U(4,5,6,7)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 2) + E(\mathrm{Sub}\ 3) + E(\mathrm{Sub}\ 23) + K[U(2,3,23)] \\
&= \sum_{i=1}^{36} U(i) + K[U(24) + U(25) + U(36) + U(2,3,23) + U(4,11,12) \\
&\quad + U(4,5,6,7) + U(33,35,36) + U(5,8) + U(11,15) \\
&\quad + U(6,8,9,10,13) + U(27,29,30,31) \\
&\quad + U(26,27,28,32,34) + 3\{U(36) + U(14,22)\}] \qquad (11)
\end{aligned}$$


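Strategy 3

Referring to the actual code, Equation (10) and strategy 3 give

$$\begin{aligned}
E_{total} ={}& \sum_{i=1}^{36} U(i) + \left[\tfrac{2}{8} \times 2U(36) + \tfrac{2}{7} U(24) + \tfrac{3}{5} U(11,15) \right. \\
&+ \tfrac{3}{37} U(25,\ldots,36) + \tfrac{15}{36} U(26,\ldots,36) \\
&+ \tfrac{4}{26} U(33,35,36) + \tfrac{2}{7} U(27,29,30,31) \\
&+ \tfrac{2}{26} U(4,11,12,14,16,\ldots,22,36) \\
&+ \tfrac{2}{34} U(4,5,6,8,\ldots,22,36) \\
&+ \tfrac{2}{34} U(4,\ldots,22,36) + \tfrac{9}{61} U(2,\ldots,36) \\
&\left. + 3\left\{\tfrac{3}{16} + \tfrac{2}{16}\right\} U(14,22,36)\right] \qquad (12)
\end{aligned}$$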


The unit efforts for each module are listed below.

Module Number    Effort, E

106762

Union of the Modules

U(11,15)
U(14,22,36)
U(33,35,36)
U(27,29,30,31)
U(4,11,12,14,16
U(4,5,6,8,9,...
U(2,3,23)
U(4,11,12)
U(14,22)
U(4,5,6,7)
U(6,8,9,10,13)
U(26,27,28,32,34)


Effort,  E 


50000 

210494 

182823 

32981 

641582 

995663 

653081 

1126370 

1309907 

4417377 

39783 

171863 

139727 

145758 

171860 

46592 

206592 


Substituting the numerical values from these tables into Equations (10), (11) and (12) yields the following values of $E_{total}$ and $T_{est}$.

Using Equation (10), with K = 1/2:

$$E_{total} = 1409268 + 5389964 = 6799232$$

$$T_1 = 6799232/(18 \times 3600) = 105 \text{ hrs.}$$

Using Equation (11), with K = 1/2:

$$E_{total} = 1409268 + 1099731 = 2508999$$

$$T_2 = 2508999/(18 \times 3600) = 38.7 \text{ hrs.}$$

5. Program: CIS 313-L3-TC0645

The  structure  chart  for  the  program  is  shown  in  the  following  diagram. 
The  major  steps  of  the  derivation  are  given  below. 


Strategy 1

$$\begin{aligned}
E(\mathrm{Sub}\ 3) &= U(3) + U(6) + U(7) + U(8) + K[U(6,7,8)] \\
E(\mathrm{Sub}\ 9) &= U(9) + U(10) + U(12) + U(13) + K[U(10,12,13)] \\
E(\mathrm{Sub}\ 2) &= U(2) + E(\mathrm{Sub}\ 3) + U(4) + U(5) + E(\mathrm{Sub}\ 9) + U(11) + K[U(3,4,\ldots,13)] \\
E(\mathrm{Sub}\ 31) &= U(31) + U(32) + K\,U(32) \\
E(\mathrm{Sub}\ 19) &= U(15) + U(19) + K\,U(15) \\
E(\mathrm{Sub}\ 23) &= U(23) + U(30) + K\,U(30) \\
E(\mathrm{Sub}\ n) &= U(n) + U(29) + U(30) + K[U(29,30)], \quad n = 24, 25, 26, 27, 28 \\
E(\mathrm{Sub}\ 22) &= U(22) + E(\mathrm{Sub}\ 23) + \sum_{n=24}^{28} E(\mathrm{Sub}\ n) + K[U(23,24,\ldots,30)] \\
E(\mathrm{Sub}\ 20) &= U(20) + U(14) + U(15) + E(\mathrm{Sub}\ 22) + K[U(14,15,22,\ldots,30)] \\
E(\mathrm{Sub}\ 18) &= U(18) + E(\mathrm{Sub}\ 19) + E(\mathrm{Sub}\ 20) + U(21) + U(17) + K[U(14,15,17,19,\ldots,30)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 2) + \sum_{i=14}^{17} U(i) + E(\mathrm{Sub}\ 18) + E(\mathrm{Sub}\ 31) + K[U(2,3,\ldots,32)] \\
&= \sum_{i=1}^{32} U(i) + K[U(15) + U(30) + U(32) + 5U(29,30) \\
&\quad + U(6,7,8) + U(10,12,13) + U(3,4,\ldots,13) \\
&\quad + U(14,15,22,\ldots,30) + U(14,15,17,19,\ldots,30) \\
&\quad + U(2,3,\ldots,32)] \qquad (13)
\end{aligned}$$



Strategy 2

$$\begin{aligned}
E(\mathrm{Sub}\ 3) &= U(3) + U(6) + U(7) + U(8) + K[U(6,7,8)] \\
E(\mathrm{Sub}\ 9) &= U(9) + U(10) + U(12) + U(13) + K[U(10,12,13)] \\
E(\mathrm{Sub}\ 2) &= U(2) + U(11) + E(\mathrm{Sub}\ 3) + U(4) + U(5) + E(\mathrm{Sub}\ 9) + K[U(3,4,5,9,11)] \\
E(\mathrm{Sub}\ 31) &= U(31) + U(32) + K\,U(32) \\
E(\mathrm{Sub}\ 19) &= U(15) + U(19) + K\,U(15) \\
E(\mathrm{Sub}\ 23) &= U(23) + U(30) + K\,U(30) \\
E(\mathrm{Sub}\ n) &= U(n) + U(29) + U(30) + K\,U(29,30), \quad n = 24, \ldots, 28 \\
E(\mathrm{Sub}\ 22) &= U(22) + E(\mathrm{Sub}\ 23) + \sum_{n=24}^{28} E(\mathrm{Sub}\ n) + K[U(23,\ldots,28)] \\
E(\mathrm{Sub}\ 20) &= U(20) + U(14) + U(15) + E(\mathrm{Sub}\ 22) + K[U(14,15,22)] \\
E(\mathrm{Sub}\ 18) &= U(18) + E(\mathrm{Sub}\ 19) + E(\mathrm{Sub}\ 20) + U(21) + U(17) + K[U(17,19,20,21)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 2) + \sum_{i=14}^{17} U(i) + E(\mathrm{Sub}\ 18) + E(\mathrm{Sub}\ 31) + K[U(2,14,\ldots,18,31)] \\
&= \sum_{i=1}^{32} U(i) + K[U(15) + U(32) + 5U(29,30) + U(6,7,8) \\
&\quad + U(10,12,13) + U(14,15,22) + U(17,19,20,21) \\
&\quad + U(23,\ldots,28) + U(3,4,5,9,11) \\
&\quad + U(2,14,\ldots,18,31)] \qquad (14)
\end{aligned}$$


Union of the Modules

U(29,30)
U(6,7,8)
U(10,12,13)
U(3,4,...,13)
U(14,15,22,...,30)
U(14,15,17,19,...,30)
U(2,3,...,32)
U(14,15,22)
U(17,19,20,21)
U(23,...,28)
U(3,4,5,9,11)
U(2,14,...,18,31)


Effort,  E 


116971 

730000 

643416 

1188818 

5250384 

89934 

265483 

283347 

231504 

677593 


Substituting these numerical values into Equations (13), (14) and (15) yields the following values of $E_{total}$ and $T_{est}$.

Using Equation (13), with K = 1/2:

$$E_{total} = 2613539 + 4250083 = 6863622$$

$$T_1 = 6863622/(18 \times 3600) = 105.9 \text{ hrs.}$$

Using Equation (14), with K = 1/2:

$$E_{total} = 2613539 + 1091886 = 3705425$$

$$T_2 = 3705425/(18 \times 3600) = 57 \text{ hrs.}$$

Using Equation (15):

$$E_{total} = 2613539 + 2606661 = 5220200$$

$$T_3 = 5220200/(18 \times 3600) = 80 \text{ hrs.}$$
6. Program: CIS 313-L3-TC0622

The  structure  chart  for  this  program  is  shown  in  the  following  diagram. 
The  major  steps  of  the  calculations  are  described  below. 


Strategy 1

$$\begin{aligned}
E(\mathrm{Sub}\ 3) &= U(2) + U(3) + K\,U(2) \\
E(\mathrm{Sub}\ 5) &= U(5) + U(6) + U(7) + U(8) + U(9) + K[U(6,7,8,9)] \\
E(\mathrm{Sub}\ 11) &= U(11) + U(12) + U(13) + U(14) + K[U(12,13,14)] \\
E(\mathrm{Sub}\ 4) &= U(4) + E(\mathrm{Sub}\ 3) + E(\mathrm{Sub}\ 5) + U(9) + U(10) + E(\mathrm{Sub}\ 11) + K[U(2,3,5,6,\ldots,14)] \\
E(\mathrm{Sub}\ 33) &= U(33) + U(34) + K\,U(34) \\
E(\mathrm{Sub}\ 19) &= U(16) + U(19) + K\,U(16) \\
E(\mathrm{Sub}\ 21) &= U(21) + U(22) + U(24) + K[U(22,24)] \\
E(\mathrm{Sub}\ 35) &= U(13) + U(35) + K\,U(13) \\
E(\mathrm{Sub}\ n) &= U(n) + U(31) + U(22) + E(\mathrm{Sub}\ 35) + U(36) + K[U(13,22,31,35,36)], \quad n = 25, 26, 28, 29, 30 \\
E(\mathrm{Sub}\ 27) &= U(27) + U(31) + U(32) + E(\mathrm{Sub}\ 35) + U(36) + U(22) + K[U(13,22,31,32,35,36)] \\
E(\mathrm{Sub}\ 23) &= U(23) + \sum_{j=25}^{30} E(\mathrm{Sub}\ j) + K[U(13,22,25,\ldots,31,35,36)] \\
E(\mathrm{Sub}\ 20) &= U(20) + U(22) + U(24) + E(\mathrm{Sub}\ 23) + U(15) + K[U(13,15,22,\ldots,31,35,36)] \\
E(\mathrm{Sub}\ 18) &= U(18) + E(\mathrm{Sub}\ 19) + E(\mathrm{Sub}\ 20) + E(\mathrm{Sub}\ 21) + U(17) + K[U(13,15,16,17,19,\ldots,31,35,36)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 4) + U(15) + U(16) + U(17) + E(\mathrm{Sub}\ 18) + E(\mathrm{Sub}\ 33) + K[U(2,3,\ldots,36)] \\
&= \sum_{i=1}^{36} U(i) + K[U(2) + U(13) + U(34) + U(12,13,14) \\
&\quad + U(6,7,8,9) + U(2,3,5,\ldots,14) \\
&\quad + U(13,22,31,32,35,36) + 5U(13,22,31,35,36) \\
&\quad + U(13,22,25,\ldots,31,35,36) + U(13,15,22,\ldots,31,35,36) \\
&\quad + U(13,15,16,17,19,\ldots,31,35,36) \\
&\quad + U(2,3,\ldots,36)] \qquad (16)
\end{aligned}$$
Strategy 2

$$\begin{aligned}
E(\mathrm{Sub}\ 3) &= U(2) + U(3) + K\,U(2) \\
E(\mathrm{Sub}\ 5) &= U(5) + U(6) + U(7) + U(8) + U(9) + K[U(6,7,8,9)] \\
E(\mathrm{Sub}\ 11) &= U(11) + U(12) + U(13) + U(14) + K[U(12,13,14)] \\
E(\mathrm{Sub}\ 4) &= U(4) + E(\mathrm{Sub}\ 3) + E(\mathrm{Sub}\ 5) + U(9) + U(10) + E(\mathrm{Sub}\ 11) + K[U(3,5,9,10,11)] \\
E(\mathrm{Sub}\ 33) &= U(33) + U(34) + K\,U(34) \\
E(\mathrm{Sub}\ 19) &= U(16) + U(19) + K\,U(16) \\
E(\mathrm{Sub}\ 21) &= U(21) + U(22) + U(24) + K[U(22,24)] \\
E(\mathrm{Sub}\ 35) &= U(13) + U(35) + K\,U(13) \\
E(\mathrm{Sub}\ n) &= U(n) + U(31) + U(22) + E(\mathrm{Sub}\ 35) + U(36) + K[U(22,31,35,36)], \quad n = 25, 26, 28, 29, 30 \\
E(\mathrm{Sub}\ 27) &= U(27) + U(31) + U(32) + E(\mathrm{Sub}\ 35) + U(36) + U(22) + K[U(22,31,32,35,36)] \\
E(\mathrm{Sub}\ 23) &= U(23) + \sum_{j=25}^{30} E(\mathrm{Sub}\ j) + K[U(25,\ldots,30)] \\
E(\mathrm{Sub}\ 20) &= U(20) + U(22) + U(24) + E(\mathrm{Sub}\ 23) + U(15) + K[U(15,22,23,24)] \\
E(\mathrm{Sub}\ 18) &= U(18) + E(\mathrm{Sub}\ 19) + E(\mathrm{Sub}\ 20) + E(\mathrm{Sub}\ 21) + U(17) + K[U(17,19,20,21)] \\
E_{total} &= U(1) + E(\mathrm{Sub}\ 4) + U(15) + U(16) + U(17) + E(\mathrm{Sub}\ 18) + E(\mathrm{Sub}\ 33) + K[U(4,15,\ldots,18,33)] \\
&= \sum_{i=1}^{36} U(i) + K[U(2) + U(13) + U(16) + U(34) \\
&\quad + U(22,24) + U(6,7,8,9) + U(12,13,14) \\
&\quad + 5U(22,31,35,36) + U(3,5,9,10,11) \\
&\quad + U(15,22,23,24) + U(22,31,32,35,36) \\
&\quad + U(17,19,20,21) + U(25,26,\ldots,30)] \qquad (17)
\end{aligned}$$
Strategy 3

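Referring to the actual code, Equation (16) and strategy 3 give

$$\begin{aligned}
E_{total} ={}& \sum_{i=1}^{36} U(i) + \left[\tfrac{1}{2} U(2) + \tfrac{2}{8} U(13) + \tfrac{2}{7} U(34) \right. \\
&+ \tfrac{2}{6} U(6,7,8,9) + \tfrac{5}{21} U(12,13,14) \\
&+ \tfrac{13}{41} U(2,3,5,6,\ldots,14) \\
&+ \tfrac{5}{19} U(13,22,31,32,35,36) \\
&+ \tfrac{1}{24} U(13,22,25,\ldots,32,35,36) \\
&+ \tfrac{6}{26} U(13,15,22,\ldots,32,35,36) \\
&+ \tfrac{2}{31} U(13,15,16,17,19,\ldots,32,35,36) \\
&+ \tfrac{13}{52} U(2,3,\ldots,36) \\
&\left. + 5 \times \tfrac{5}{18} U(13,22,31,35,36)\right] \qquad (18)
\end{aligned}$$

The unit efforts for each module are listed below.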


Module Number   Effort, E      Module Number   Effort, E
1               254810         19              37967
2               3950           20              54702
3               17641          21              83802
4               84206          22              77958
5               20724          23              18400
6               6833           24              48179
7               7184           25              133481
8               10774          26              70769
9               5362           27              44601
10              12442          28              85176
11              106383         29              70769
12              18387          30              68746
13              22949          31              30164
14              55706          32              44229
15              18087          33              55758
16              29017          34              18486
17              48179          35              15638
18              48179          36              4818

Union of the Modules                    Effort, E
U(6,7,8,9)                              27626
U(12,13,14)                             127210
U(13,22,31,35,36)                       187760
U(13,22,31,32,35,36)                    246134
U(13,22,25,...,31,35,36)                580819
U(13,15,22,...,31,35,36)                1111212
U(13,15,16,17,19,...,31,35,36)          1706111
U(2,3,...,36)                           5397419
U(2,3,5,...,14)                         712167
U(22,24)                                82369
U(22,31,35,36)                          154825
U(22,31,32,35,36)                       207027
U(17,19,20,21)                          223411
U(15,22,23,24)                          233153
U(25,...,30)                            348113
U(3,5,9,10,11)                          308263

Substituting these numerical values into Equations (16), (17) and (18) yields the following values of $E_{total}$ and $T_{est}$.

Using Equation (16), with K = 1/2:

$$E_{total} = 1734458 + 5446441 = 7180899$$

$$T_1 = 7180899/(18 \times 3600) = 110 \text{ hrs.}$$

Using Equation (17), with K = 1/2:

$$E_{total} = 1734458 + 1202849 = 2937307$$

$$T_2 = 2937307/(18 \times 3600) = 45 \text{ hrs.}$$

Using Equation (18):

$$E_{total} = 1734458 + 2343908 = 4078366$$

$$T_3 = 4078366/(18 \times 3600) = 63 \text{ hrs.}$$
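
Strategy 3 replaces the single constant K with a separate fractional weight for each union of modules. The sketch below evaluates the bracketed term of Equation (18) from the tabulated union efforts. One assumption is needed: the unions written with "...,32,35,36" in Equation (18) do not appear in the table, so the tabulated "...,31,35,36" values are used in their place. The variable names are illustrative.

```python
from fractions import Fraction as F

# (weight, effort) pairs in the order of the terms in Equation (18).
terms = [
    (F(1, 2), 3950),          # U(2)
    (F(2, 8), 22949),         # U(13)
    (F(2, 7), 18486),         # U(34)
    (F(2, 6), 27626),         # U(6,7,8,9)
    (F(5, 21), 127210),       # U(12,13,14)
    (F(13, 41), 712167),      # U(2,3,5,...,14)
    (F(5, 19), 246134),       # U(13,22,31,32,35,36)
    (F(1, 24), 580819),       # assumed: tabulated U(13,22,25,...,31,35,36)
    (F(6, 26), 1111212),      # assumed: tabulated U(13,15,22,...,31,35,36)
    (F(2, 31), 1706111),      # assumed: tabulated U(13,15,16,17,19,...,31,35,36)
    (F(13, 52), 5397419),     # U(2,3,...,36)
    (5 * F(5, 18), 187760),   # five copies of U(13,22,31,35,36)
]

integration_effort = float(sum(weight * effort for weight, effort in terms))
unit_effort_sum = 1734458  # sum of unit efforts as used in the text
e_total = unit_effort_sum + integration_effort
hours = e_total / (18 * 3600)
print(round(integration_effort), round(e_total), round(hours, 1))
```

Under these assumptions the bracketed term comes out within a few units of the 2343908 reported for Equation (18), which suggests the substitution matches what was actually computed.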


